Abstract

We model the growth of a cell population by a piecewise deterministic
Markov branching tree. Each cell splits into two offspring at a division
rate B(x) that depends on its size x. The size of each cell grows
exponentially in time, at a rate that varies for each individual. We show
that the mean empirical measure of the model satisfies a
growth-fragmentation type equation if structured in both size and growth
rate as state variables. We construct a nonparametric estimator of the
division rate B(x) based on the observation of the population over
different sampling schemes of size n on the genealogical tree. Our
estimator nearly achieves the rate $n^{-s/(2s+1)}$ in squared-loss error
asymptotically. When the growth rate is assumed to be identical for every
cell, we retrieve the classical growth-fragmentation model and our
estimator improves on the rate $n^{-s/(2s+3)}$ obtained in [10, 12] through
indirect observation schemes. Our method is consistently tested
numerically and implemented on Escherichia coli data.

1 Introduction

1.1 Size-structured models and their inference 

In mathematical biology, physiologically structured equations [23] make it
possible to describe the temporal evolution of a population characterised by
state variables such as age, size, growth, maturity, protein content and so on
(see for instance [23, 27] and the references therein). A paradigmatic example is
given by the growth-fragmentation or size-structured cell division equation. 
For the evolution of a bacterial population it reads 

$$\begin{cases} \partial_t n(t,x) + \tau\,\partial_x\big(x\,n(t,x)\big) + B(x)\,n(t,x) = 4B(2x)\,n(t,2x), \\ n(0,x) = n^{(0)}(x), \quad x \ge 0, \end{cases} \tag{1}$$

and it quantifies the density n(t, x) of cells having size x (the state variable) 
at time t. A common stochastic mechanism for every single cell is attached 
to (1): 

1. The size x = x(t) of a cell at time t evolves exponentially according
to the deterministic evolution $dx(t) = \tau x(t)\,dt$, where $\tau > 0$ is the
growth rate of each cell, which quantifies its ability to ingest a common
nutrient.

2. Each cell splits into two offspring according to a division rate B(x)
that depends on its current size x.

3. At division, a cell of size x gives birth to two offspring of size x/2
each, which is called binary fission.

Model (1) is thus entirely determined by the parameters $(\tau, B(x), x \in [0,\infty))$.
Typically, $\tau$ is assumed to be known or guessed [11], and thus inference about
(1) mainly concerns the estimation of the division rate B(x) that has to be
taken from a nonparametric perspective.

By use of the general relative entropy principle, Michel, Mischler and
Perthame showed that the approximation $n(t,x)e^{-\lambda_0 t} \approx N(x)$ is valid [26],
with $\lambda_0 > 0$, and where $(\lambda_0, N)$ is the dominant eigenpair related to the
corresponding eigenvalue problem, see [25, 27, 9, 22, 5, 1]. The "stationary"
density N(x) of typical cells after some time has elapsed makes it possible to
recover $(B(x), x \in \mathcal{D})$ for a compact $\mathcal{D} \subset (0,\infty)$ by means of the
regularisation of an inverse problem of ill-posedness degree 1. From a
deterministic perspective, this is carried out in [28, 12, 13]. From a
statistical inference perspective, if an n-sample of the distribution N(x) is
observed and if B(x) has smoothness s > 0 in a Sobolev sense, it is proved in
[10] that B(x) can be recovered in squared-error loss over compact sets with a
rate of convergence $n^{-s/(2s+3)}$. Both the deterministic and the stochastic
methodologies of [12] and [10] are motivated by experimental designs and data
such as in [21, 11]. However, they do not take into account the following two
important aspects:

• Bacterial growth exhibits variations in the individual growth rate $\tau$
as demonstrated for instance in [29]. One would like to incorporate
variability in the growth rate within the system at the level of a single
cell. This requires modifying Model (1).

• Recent evolution of experimental technology makes it possible to track the
whole genealogy of cell populations (along prescribed lines of descendants for
instance), affording the observation of other state variables such as size
at division, lifetime of a single individual and so on [31]. Making the
best possible use of such measures is of great potential impact, and
needs a complementary approach.

The availability of observation schemes at the level of cell individuals
suggests an enhancement of the statistical inference of $(B(x), x \in \mathcal{D})$, possibly
making it possible to improve on the rates of convergence obtained by indirect
measurements such as in [10, 12]. This is the purpose of the present paper.

1.2 Results of the paper 
Statistical setting 

Let

$$\mathcal{U} = \bigcup_{k=0}^{\infty} \{0,1\}^k$$

denote the binary genealogical tree (with $\{0,1\}^0 := \{\emptyset\}$). We identify each
node $u \in \mathcal{U}$ with a cell that has a size at birth $\xi_u$ and a lifetime $\zeta_u$. In
the paper, we consider the problem of estimating $(B(x), x \in [0,\infty))$ over
compact sets of $(0,\infty)$. Our inference procedure is based on the observation
of

$$\big((\xi_u, \zeta_u),\ u \in \mathcal{U}_n\big), \tag{2}$$

where $\mathcal{U}_n \subset \mathcal{U}$ denotes a connected subset of size n containing the root
$u = \emptyset$. Asymptotics are taken as $n \to \infty$. Two important observation
schemes are considered: the sparse tree case, when we follow the system
along a given branch with n individuals, and the full tree case, where we
follow the evolution of the whole binary tree up to the $N_n$-th generation,
with $N_n \approx \log_2 n$. In this setting, we are able to generalise Model (1) and
allow the growth rate $\tau$ to vary with each cell $u \in \mathcal{U}$. We assume that a given
cell u has a random growth rate $\tau_u = v \in \mathcal{E} \subset (0,\infty)$ (later constrained to
live on a compact set). Moreover, this value v is inherited from the growth
rate v' of its parent according to a distribution $\rho(v', dv)$. Since a cell splits
into two offspring of the same size, letting $u^-$ denote the parent of u, we
have the fundamental relationship

$$2\xi_u = \xi_{u^-} \exp\big(\tau_{u^-}\zeta_{u^-}\big) \tag{3}$$

that makes it possible to recover the growth rate $\tau_u$ of each individual in
$\mathcal{U}_n$ since $\mathcal{U}_n$ is connected by assumption, possibly leaving out the last
generation of observed individuals, but this has asymptotically no effect on a
large sample size approach.
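
To illustrate how relationship (3) yields growth rates from the observed sizes at birth and lifetimes, here is a minimal Python sketch. The function name and argument names are hypothetical and are not taken from the paper; the formula is simply (3) solved for the parent's growth rate.

```python
import math

def recover_growth_rate(xi_parent, zeta_parent, xi_child):
    """Solve 2 * xi_child = xi_parent * exp(tau_parent * zeta_parent) for tau_parent.

    xi_parent   : size at birth of the parent cell
    zeta_parent : lifetime of the parent cell
    xi_child    : size at birth of either daughter cell
    """
    return math.log(2.0 * xi_child / xi_parent) / zeta_parent

# A parent born at size 1.0 that grows at rate 0.7 for 1.3 time units
# produces daughters of size 0.5 * exp(0.7 * 1.3).
child_size = 0.5 * 1.0 * math.exp(0.7 * 1.3)
recovered = recover_growth_rate(1.0, 1.3, child_size)
```

The growth rate of a cell is only available once one of its daughters has been observed, which is why the last observed generation may have to be left out.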

Variability in the growth rate 

In the case where the growth rate can vary for each cell, the density n(t, x)
of cells of size x at time t does not follow Eq. (1) anymore and an extended
framework needs to be considered. To that end, we structure the system with
an additional variable $\tau_u = v$, which represents the growth rate and depends
on each individual cell u. We construct in Section 2 a branching Markov
chain $((\xi_u, \tau_u), u \in \mathcal{U})$ that incorporates variability for the growth rate in the
mechanism described in Section 1.1. Equivalently to the genealogical tree,
the system may be described in continuous time by a piecewise deterministic
Markov process

$$(X(t), V(t)) = \big((X_1(t), V_1(t)), (X_2(t), V_2(t)), \ldots\big),$$

which models the process of sizes and growth rates of the living particles in
the system at time t, with value in $\bigcup_{k=0}^{\infty} S^k$, where $S = [0,\infty) \times \mathcal{E}$ is the
state space of size times growth rate. Stochastic systems of this kind that
correspond to branching Markov chains are fairly well known, both from
a theoretical angle and in applications; a selected list of contributions is
[2, 7, 24] and the references therein.

By fragmentation techniques inspired by Bertoin [4], see also Haas [17],
we relate the process (X, V) to a growth-fragmentation equation as follows.
Define

$$\langle n(t,\cdot), \varphi \rangle = \mathbb{E}\Big[\sum_{i=1}^{\infty} \varphi\big(X_i(t), V_i(t)\big)\Big]$$

as the expectation of the empirical measure of the process (X, V) over
smooth test functions $\varphi$ defined on S. We prove in Theorem 1 that, under
appropriate regularity conditions, the measure $n(t,\cdot)$ that we identify with
the temporal evolution of the density n(t, x, v) of cells having size x and
growth rate v at time t is governed (in a weak sense, see footnote 1) by

$$\begin{cases} \partial_t n(t,x,v) + v\,\partial_x\big(x\,n(t,x,v)\big) + B(x)\,n(t,x,v) = 4B(2x)\displaystyle\int_{\mathcal{E}} \rho(v',v)\,n(t,2x,dv'), \\ n(0,x,v) = n^{(0)}(x,v), \quad x \ge 0. \end{cases} \tag{4}$$

This result in a sense legitimates our methodology: by enabling each cell to
have its own growth rate and by building new statistical estimators in
this context, we still have a translation in terms of the approach in [12].
In particular, if we assume a constant growth rate $\tau > 0$, we then take
$\rho(v', dv) = \delta_\tau(dv)$ (where $\delta$ denotes the Dirac mass) and we retrieve the
standard growth-fragmentation equation (1). The proof of Theorem 1 is
obtained via a so-called many-to-one formula, established in Proposition 3
in Section 5.1. Indeed, thanks to the branching property of the system,
it is possible to relate the behaviour of additive functionals like the mean
empirical measure to the behaviour of a so-called tagged cell (like a tagged
fragment in a fragmentation process), that consists in following the behaviour
of a single line of descendants along a branch where each node is picked
at random, according to a uniform distribution. This approach, inspired
by fragmentation techniques, is quite specific to our model and yields a
relatively direct proof of Theorem 1.



Nonparametric estimation of the growth rate 

In Section 3 we take over the problem of estimating $(B(x), x \in \mathcal{D})$ for some
compact $\mathcal{D} \subset (0,\infty)$. We assume we have data of the form (2), and that the
mean evolution of the system is governed by (4). The growth rate kernel $\rho$
is unknown and treated as a nuisance parameter. A fundamental object is
the transition kernel

$$\mathcal{P}_B(\mathbf{x}, d\mathbf{x}') = \mathbb{P}\big((\xi_u, \tau_u) \in d\mathbf{x}' \,\big|\, (\xi_{u^-}, \tau_{u^-}) = \mathbf{x}\big)$$

of the size and growth rate distribution $(\xi_u, \tau_u)$ at the birth of a descendant
$u \in \mathcal{U}$, given the size at birth and growth rate of its parent $(\xi_{u^-}, \tau_{u^-})$. We
define in Section 3.3 a class of division rates and growth rate kernels such
that if $(B, \rho)$ belongs to this class, then the transition $\mathcal{P}_B$ is geometrically
ergodic and has a unique invariant measure $\nu_B(d\mathbf{x}) = \nu_B(x, dv)\,dx$. From
the invariant measure equation

$$\nu_B \mathcal{P}_B = \nu_B$$

we obtain in Proposition 2 the explicit representation

$$B(x) = \frac{x}{2}\,\frac{\nu_B(x/2)}{\mathbb{E}_{\nu_B}\Big[\frac{1}{\tau_{u^-}}\,1_{\{\xi_{u^-} \le x,\ \xi_u \ge x/2\}}\Big]}, \tag{5}$$

where $\nu_B(x) = \int_{\mathcal{E}} \nu_B(x, dv)$ denotes the first marginal of the invariant
distribution $\nu_B$. A strategy for constructing an estimator $\widehat{B}_n$ consists in replacing
the right-hand side of (5) by its empirical counterpart, the numerator being
estimated via a kernel estimator of the first marginal of $\nu_B(d\mathbf{x})$. Under a local
Hölder smoothness assumption on B of order s > 0, we prove in Theorem
2 that for a suitable choice of bandwidth in the estimation of the invariant
density, our estimator achieves the rate $n^{-s/(2s+1)}$ in squared-error loss over
appropriate compact sets $\mathcal{D} \subset (0,\infty)$, up to an inessential logarithmic term
when the full tree observation scheme is considered. We see in particular
that we improve on the rate obtained in [10]. Our result quantifies the
improvement obtained when estimating B(x) from data $((\xi_u, \zeta_u), u \in \mathcal{U}_n)$, as
opposed to overall measurements of the system after some time has elapsed
as in [10]. We provide a quantitative argument based on the analysis of a
PDE that explains the reduction of ill-posedness achieved by our method
over [10] in Section 4.2.

In order to obtain the upper bound of Theorem 2, a major technical
difficulty is that we need to establish uniform rates of convergence of the
empirical counterparts to their limits in the numerator and denominator
of (5) when the data are spread along a binary tree. This can be done
via covariance inequalities that exploit the fact that the transition $\mathcal{P}_B$ is
geometrically ergodic (Proposition 4) using standard Markov techniques,
see [24, 3]. The associated chain is however not reversible, and this yields
an additional difficulty: the decay of the correlations between $\varphi(\xi_u, \tau_u)$ and
$\varphi(\xi_v, \tau_v)$ for two nodes $u, v \in \mathcal{U}_n$ is expressed in terms of the sup-norm of $\varphi$,
whenever $|\varphi(\mathbf{x})| \le \mathbb{V}(\mathbf{x})$ is dominated by a certain Lyapunov function $\mathbb{V}$ for
the transition $\mathcal{P}_B$. However, the typical functions $\varphi$ we use are kernels that
depend on n and that are not uniformly bounded in sup-norm as $n \to \infty$.
This partly explains the relative length of the technical Sections 5.5 and 5.6.

Footnote 1. For every $t \ge 0$, we actually have a Radon measure n(t, dx, dv) on
$S = [0,\infty) \times \mathcal{E}$: if $\varphi(x,v)$ is a function defined on S, we define
$\langle n(t,\cdot), \varphi \rangle = \int_S \varphi(x,v)\,n(t,dx,dv)$ whenever the integral is meaningful. Thus (4)
has the following sense: for every sufficiently smooth test function $\varphi$ with
compact support in $\mathcal{E}$, we have

$$\int_S \Big(\partial_t n(t,dx,dv)\,\varphi(x,v) - v\,x\,n(t,dx,dv)\,\partial_x\varphi(x,v) + B(x)\,n(t,dx,dv)\,\varphi(x,v)\Big) = 4\int_S \Big(B(2x)\int_{\mathcal{E}} \rho(v',dv)\,n(t,2dx,dv')\,\varphi(x,v)\Big).$$

1.3 Organisation of the paper 

In Section 2, we construct the model $((\xi_u, \tau_u), u \in \mathcal{U})$ of sizes and growth
rates of the cells as a Markov chain along the genealogical tree. The
discrete model can be embedded into a continuous time piecewise deterministic
Markov process (X, V) of sizes and growth rates of the cells present at any
time within the system. In Theorem 1 we make explicit the relation between the
mean empirical measure of (X, V) and the growth-fragmentation type
equation (4). In Section 3, we explicitly construct an estimator $\widehat{B}_n$ of B by means
of the representation given by (5) in Section 3.2. Two observation schemes
are considered and discussed in Section 3.1, whether we consider data along
a single branch (the sparse tree case) or along the whole genealogy (the full
tree case). The specific assumptions and the class of admissible division
rates B and growth rate kernels $\rho$ are discussed in Section 3.3, and an upper
bound for $\widehat{B}_n$ in squared-error loss is given in our main Theorem 2.
Section 4 shows and discusses the numerical implementation of our method on
simulated data. In particular, ignoring the variability in the reconstruction
dramatically deteriorates the accuracy of estimation of B. We also explain
from a deterministic perspective the rate improvement of our method
compared with [10] by means of a PDE analysis argument in Section 4.2.
The parameters are inspired from real data experiments on Escherichia coli
cell cultures. Section 5 is devoted to the proofs.

2 A Markov model on a tree 
2.1 The genealogical construction 

Recall that $\mathcal{U} := \bigcup_{n=0}^{\infty} \{0,1\}^n$ (with $\{0,1\}^0 := \{\emptyset\}$) denotes the infinite
binary genealogical tree. Each node $u \in \mathcal{U}$ is identified with a cell of the
population and has a mark

$$(\xi_u, b_u, \zeta_u, \tau_u),$$

where $\xi_u$ is the size at birth, $\tau_u$ the growth rate, $b_u$ the birthtime and $\zeta_u$ the
lifetime of u. The evolution $(\xi_t^u, t \in [b_u, b_u + \zeta_u))$ of the size of u during its
lifetime is governed by

$$\xi_t^u = \xi_u \exp\big(\tau_u (t - b_u)\big) \quad \text{for } t \in [b_u, b_u + \zeta_u). \tag{6}$$

Each cell splits into two offspring of the same size according to a division
rate B(x) for $x \in (0,\infty)$. Equivalently

$$\mathbb{P}\big(\zeta_u \in [t, t+dt] \,\big|\, \zeta_u \ge t,\ \xi_u = x,\ \tau_u = v\big) = B\big(x \exp(vt)\big)\,dt. \tag{7}$$

At division, a cell splits into two offspring of the same size. If $u^-$ denotes
the parent of u, we thus have

$$2\xi_u = \xi_{u^-} \exp\big(\tau_{u^-}\zeta_{u^-}\big). \tag{8}$$

Finally, the growth rate $\tau_u$ of u is inherited from its parent $\tau_{u^-}$ according
to a Markov kernel

$$\rho(v, dv') = \mathbb{P}\big(\tau_u \in dv' \,\big|\, \tau_{u^-} = v\big), \tag{9}$$

where v > 0 and $\rho(v, dv')$ is a probability measure on $(0,\infty)$ for each v > 0.
Eq. (6), (7), (8) and (9) completely determine the dynamics of the
model $((\xi_u, \tau_u), u \in \mathcal{U})$, as a Markov chain on a tree, given an additional
initial condition $(\xi_\emptyset, \tau_\emptyset)$ on the root. The chain is embedded into a piecewise
deterministic continuous Markov process thanks to (6) by setting

$$(\xi_t^u, \tau_t^u) = \big(\xi_u \exp(\tau_u (t - b_u)),\ \tau_u\big) \quad \text{for } t \in [b_u, b_u + \zeta_u)$$

and (0, 0) otherwise. Define

$$(X(t), V(t)) = \big((X_1(t), V_1(t)), (X_2(t), V_2(t)), \ldots\big)$$

as the process of sizes and growth rates of the living particles in the system
at time t. We have an identity between point measures

$$\sum_{i=1}^{\infty} 1_{\{X_i(t) > 0\}}\,\delta_{(X_i(t), V_i(t))} = \sum_{u \in \mathcal{U}} 1_{\{b_u \le t < b_u + \zeta_u\}}\,\delta_{(\xi_t^u, \tau_t^u)} \tag{10}$$

where $\delta$ denotes the Dirac mass. In the sequel, the following basic
assumption is in force.

Assumption 1 (Basic assumption on B and $\rho$). The division rate $x \leadsto B(x)$
is continuous. We have $B(0) = 0$ and $\int^{\infty} x^{-1}B(x)\,dx = \infty$. The Markov
kernel $\rho(v, dv')$ is defined on a compact set $\mathcal{E} \subset (0,\infty)$.

Proposition 1. Work under Assumption 1. The law of

$$\big((X(t), V(t)),\ t \ge 0\big) \quad\text{or}\quad \big((\xi_u, \tau_u),\ u \in \mathcal{U}\big) \quad\text{or}\quad \big((\xi_t^u, \tau_t^u),\ t \ge 0,\ u \in \mathcal{U}\big)$$

is well-defined on an appropriate probability space.

If $\mu$ is a probability measure on the state space $S = [0,\infty) \times \mathcal{E}$, we shall
denote indifferently by $\mathbb{P}_\mu$ the law of any of the three processes above where
the root $(\xi_\emptyset, \tau_\emptyset)$ has distribution $\mu$. The construction is classical (see for
instance [4] and the references therein) and is outlined in Appendix 6.1.

2.2 The behaviour of the mean empirical measure 

Denote by $\mathcal{C}^1_0(S)$ the set of real-valued test functions with compact support
in the interior of S.

Theorem 1 (Behaviour of the empirical mean). Work under Assumption 1.
Let $\mu$ be a probability distribution on S. Define the distribution n(t, dx, dv)
by

$$\langle n(t,\cdot), \varphi \rangle = \mathbb{E}_\mu\Big[\sum_{i=1}^{\infty} \varphi\big(X_i(t), V_i(t)\big)\Big] \quad \text{for every } \varphi \in \mathcal{C}^1_0(S).$$

Then $n(t,\cdot)$ solves (in a weak sense)

$$\begin{cases} \partial_t n(t,x,v) + v\,\partial_x\big(x\,n(t,x,v)\big) + B(x)\,n(t,x,v) = 4B(2x)\displaystyle\int_{\mathcal{E}} \rho(v',v)\,n(t,2x,dv'), \\ n(0,x,v) = n^{(0)}(x,v), \quad x \ge 0, \end{cases}$$

with initial condition $n^{(0)}(dx, dv) = \mu(dx, dv)$.

Theorem 1 somehow legitimates our methodology: by enabling each cell 
to have its own growth rate and by building-up new statistical estimators 
in this context, we still have a translation in terms of the approach in [12]. 
In particular, we will be able to compare our estimation results with [10]. 
Our proof is based on fragmentation techniques, inspired by Bertoin [4] and 
Haas [17]. Alternative approaches to the same kind of questions include the 
probabilistic studies of Chauvin et al. [6], Bansaye et al. [2] or Harris and 
Roberts [18] and the references therein.

3 Statistical estimation of the division rate

3.1 Two observation schemes

Let $\mathcal{U}_n \subset \mathcal{U}$ denote a subset of size n of connected nodes: if u belongs to
$\mathcal{U}_n$, so does its parent $u^-$. We look for a nonparametric estimator of the
division rate

$$y \leadsto \widehat{B}_n(y) = \widehat{B}_n\big(y, (\xi_u, \tau_u), u \in \mathcal{U}_n\big) \quad \text{for } y \in (0,\infty).$$

Statistical inference is based on the observation scheme

$$\big((\xi_u, \tau_u),\ u \in \mathcal{U}_n\big)$$

and asymptotic study is undertaken as the population size of the sample
$n \to \infty$. We are interested in two specific observation schemes.

The full tree case. We observe every pair $(\xi_u, \tau_u)$ over the first $N_n$
generations of the tree:

$$\mathcal{U}_n = \{u \in \mathcal{U},\ |u| \le N_n\}$$

with the notation |u| = n if $u = (u_0, u_1, \ldots, u_n) \in \mathcal{U}$, and $N_n$ is chosen such
that $2^{N_n}$ has order n. □

The sparse tree case. We follow the first n offspring of a single cell, along
a fixed line of descendants. This means that for some $u \in \mathcal{U}$ with |u| = n,
we observe every size $\xi_u$ and growth rate $\tau_u$ of each node $(u_0)$, $(u_0, u_1)$,
$(u_0, u_1, u_2)$ and so on up to a final node $u = (u_0, u_1, \ldots, u_n)$. □
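
The two observation schemes can be made concrete by enumerating the node labels they contain. The following Python sketch is only an illustration: it encodes a node as a tuple of 0/1 digits with the root as the empty tuple and takes the generation of a node to be the length of its tuple, which is a labelling convention assumed here. The function names are hypothetical.

```python
from itertools import product

def full_tree_nodes(depth):
    """All node labels up to generation `depth`, root (empty tuple) included."""
    return [u for k in range(depth + 1) for u in product((0, 1), repeat=k)]

def sparse_tree_nodes(leaf):
    """The line of descent leading to `leaf`: root first, `leaf` last."""
    return [tuple(leaf[:k]) for k in range(len(leaf) + 1)]

def parent(u):
    """Label of the parent of a non-root node u."""
    return u[:-1]

full = full_tree_nodes(3)           # 1 + 2 + 4 + 8 nodes
branch = sparse_tree_nodes((0, 1, 1))
```

Both sets are connected in the sense used above: the parent of every non-root node of the set also belongs to the set.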

Remark 1. For every $n \ge 1$, we tacitly assume that there exists a (random)
time $T_n < \infty$ almost surely, such that for $t \ge T_n$, the observation scheme
$\mathcal{U}_n$ is well-defined. This is a consequence of the behaviour of B near infinity
that we impose later on in (17) below.

3.2 Estimation of the division rate 
Identification of the division rate 

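We denote by $\mathbf{x} = (x, v)$ an element of the state space $S = [0,\infty) \times \mathcal{E}$.
Introduce the transition kernel

$$\mathcal{P}_B(\mathbf{x}, d\mathbf{x}') = \mathbb{P}\big((\xi_u, \tau_u) \in d\mathbf{x}' \,\big|\, (\xi_{u^-}, \tau_{u^-}) = \mathbf{x}\big)$$

of the size and growth rate distribution $(\xi_u, \tau_u)$ at the birth of a descendant
$u \in \mathcal{U}$, given the size at birth and growth rate of its parent $(\xi_{u^-}, \tau_{u^-})$. From
(7), we infer that $\mathbb{P}(\zeta_{u^-} \in dt \mid \xi_{u^-} = x, \tau_{u^-} = v)$ is equal to

$$B\big(x \exp(vt)\big) \exp\Big(-\int_0^t B\big(x \exp(vs)\big)\,ds\Big)\,dt.$$

Using formula (8), by a simple change of variables

$$\mathbb{P}\big(\xi_u \in dx' \,\big|\, \xi_{u^-} = x, \tau_{u^-} = v\big) = \frac{B(2x')}{v x'}\,1_{\{x' \ge x/2\}} \exp\Big(-\int_{x/2}^{x'} \frac{B(2s)}{vs}\,ds\Big)\,dx'.$$

Incorporating (9), we obtain an explicit formula for

$$\mathcal{P}_B(\mathbf{x}, d\mathbf{x}') = \mathcal{P}_B\big((x,v), x', dv'\big)\,dx',$$

with

$$\mathcal{P}_B\big((x,v), x', dv'\big) = \frac{B(2x')}{v x'}\,1_{\{x' \ge x/2\}} \exp\Big(-\int_{x/2}^{x'} \frac{B(2s)}{vs}\,ds\Big)\,\rho(v, dv'). \tag{11}$$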
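
As a sanity check on the change of variables, the size component of (11) can be evaluated numerically. The sketch below is an illustration only: it takes B(x) = x^2 (the choice made later in Section 4), for which the integral in the exponent has the closed form 2(x'^2 - x^2/4)/v, and verifies that the density integrates to one. The grid bounds and function names are assumptions of this sketch.

```python
import numpy as np

def birth_size_density(grid, v, B):
    """Density of the daughter's size at birth on an increasing grid that
    starts at x/2, where x is the parent's size at birth and v its growth rate."""
    hazard = B(2.0 * grid) / (v * grid)
    # Cumulative integral of the hazard from x/2 (trapezoidal rule).
    increments = 0.5 * (hazard[1:] + hazard[:-1]) * np.diff(grid)
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return hazard * np.exp(-cumulative)

x_parent, v = 1.0, 1.0
grid = np.linspace(x_parent / 2.0, 5.0, 20001)
density = birth_size_density(grid, v, lambda x: x ** 2)

# Total mass (trapezoidal rule) and closed form for B(x) = x^2.
total_mass = np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(grid))
closed_form = 4.0 * grid / v * np.exp(-2.0 * (grid ** 2 - x_parent ** 2 / 4.0) / v)
```

The upper bound 5.0 truncates a tail that is negligible for these parameter values; other division rates may need a wider grid.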

Assume further that $\mathcal{P}_B$ admits an invariant probability measure $\nu_B(d\mathbf{x})$,
i.e. a solution to

$$\nu_B \mathcal{P}_B = \nu_B, \tag{12}$$

where

$$\mu\mathcal{P}_B(d\mathbf{y}) = \int_S \mu(d\mathbf{x})\,\mathcal{P}_B(\mathbf{x}, d\mathbf{y})$$

denotes the left action of positive measures $\mu(d\mathbf{x})$ on S for the transition
$\mathcal{P}_B$.

Proposition 2. Work under Assumption 1. Then $\mathcal{P}_B$ admits an invariant
probability measure $\nu_B$ of the form $\nu_B(d\mathbf{x}) = \nu_B(x, dv)\,dx$ and we have

$$\nu_B(y) = \frac{B(2y)}{y}\,\mathbb{E}_{\nu_B}\Big[\frac{1}{\tau_{u^-}}\,1_{\{\xi_{u^-} \le 2y,\ \xi_u \ge y\}}\Big] \tag{13}$$

where $\mathbb{E}_{\nu_B}[\cdot]$ denotes expectation when the initial condition $(\xi_\emptyset, \tau_\emptyset)$ has
distribution $\nu_B$ and where we have set $\nu_B(y) = \int_{\mathcal{E}} \nu_B(y, dv')$ in (13) for the
marginal density of the invariant probability measure $\nu_B$ with respect to y.

We exhibit below a class of division rates B and growth rate kernels $\rho$
that guarantees the existence of such an invariant probability measure.

Construction of a nonparametric estimator 

Inverting (13) and applying an appropriate change of variables, we obtain

$$B(y) = \frac{y}{2}\,\frac{\nu_B(y/2)}{\mathbb{E}_{\nu_B}\Big[\frac{1}{\tau_{u^-}}\,1_{\{\xi_{u^-} \le y,\ \xi_u \ge y/2\}}\Big]}, \tag{14}$$

provided the denominator is positive. Representation (14) suggests an
estimation procedure, replacing the marginal density $\nu_B(y/2)$ and the
expectation in the denominator by their empirical counterparts. To that end, pick
a kernel function

$$K : [0,\infty) \to \mathbb{R}, \qquad \int_{[0,\infty)} K(y)\,dy = 1,$$

and set $K_h(y) = h^{-1}K(h^{-1}y)$ for $y \in [0,\infty)$ and h > 0. Our estimator is
defined as

$$\widehat{B}_n(y) = \frac{y}{2}\,\frac{n^{-1}\sum_{u \in \mathcal{U}_n} K_h(\xi_u - y/2)}{n^{-1}\sum_{u \in \mathcal{U}_n} \frac{1}{\tau_{u^-}}\,1_{\{\xi_{u^-} \le y,\ \xi_u \ge y/2\}} \vee \varpi}, \tag{15}$$

where $\varpi > 0$ is a threshold that ensures that the estimator is well defined
in all cases and $x \vee y = \max\{x, y\}$. Thus $(\widehat{B}_n(y), y \in \mathcal{D})$ is specified by the
choice of the kernel K, the bandwidth h > 0 and the threshold $\varpi > 0$.
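
A minimal Python sketch of the estimator (15) at a single point y may help fix ideas. It assumes the data are available as parent-child triples (size at birth of the parent, growth rate of the parent, size at birth of the child); this data layout and the function names are hypothetical. The Gaussian kernel is used for convenience, as in Section 4, even though it does not have compact support.

```python
import math

def gaussian_kernel(x):
    """Standard Gaussian density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def division_rate_estimate(y, triples, h, threshold, kernel=gaussian_kernel):
    """Evaluate the ratio estimator at y.

    triples   : list of (xi_parent, tau_parent, xi_child)
    h         : bandwidth of the kernel density estimate
    threshold : lower bound applied to the empirical denominator
    """
    n = len(triples)
    # Numerator: kernel estimate of the birth-size density at y / 2.
    density = sum(kernel((xi_c - y / 2.0) / h) for _, _, xi_c in triples) / (n * h)
    # Denominator: empirical mean of (1 / tau_parent) over the event
    # {xi_parent <= y, xi_child >= y / 2}.
    denominator = sum(1.0 / tau_p for xi_p, tau_p, xi_c in triples
                      if xi_p <= y and xi_c >= y / 2.0) / n
    return 0.5 * y * density / max(denominator, threshold)

# One parent-child pair, chosen so that the result can be checked by hand:
# the numerator is K(0) and the denominator is 1.
value = division_rate_estimate(2.0, [(1.0, 1.0, 1.0)], h=1.0, threshold=0.1)
```

The threshold only matters at points y where few observed cells contribute to the denominator.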

Assumption 2. The function K has compact support, and for some integer
$n_0 \ge 1$, we have $\int_{[0,\infty)} x^k K(x)\,dx = 1_{\{k=0\}}$ for $0 \le k \le n_0$.

3.3 Error estimates 

We assess the quality of $\widehat{B}_n$ in squared-loss error over compact intervals
$\mathcal{D}$. We need to specify local smoothness properties of B over $\mathcal{D}$, together
with general properties that ensure that the empirical measurements in (15)
converge with an appropriate speed of convergence. This amounts to imposing
an appropriate behaviour of B near the origin and infinity.

Model constraints 

For $\lambda > 0$ and a vector of positive constants $c = (r, m, \ell, L)$, introduce the
class $\mathcal{F}^\lambda(c)$ of continuous functions $B : [0,\infty) \to [0,\infty)$ such that

$$\int_0^{r/2} x^{-1}B(2x)\,dx \le L, \qquad \int_{r/2}^{r} x^{-1}B(2x)\,dx \ge \ell, \tag{16}$$

and

$$B(x) \ge m\,x^\lambda \quad \text{for } x \ge r. \tag{17}$$

Remark 2. Similar conditions on the behaviour of B can also be found in 
[9], in a deterministic setting. 

Remark 3. Assumption 1 is satisfied as soon as $B \in \mathcal{F}^\lambda(c)$. As mentioned
in Remark 1, there are arbitrarily many divisions for sufficiently large n,
thanks to (17) and our observation scheme $\mathcal{U}_n$ is thus well-defined under
(17).

Let $\rho_{\min}$, $\rho_{\max}$ be two probability measures on $\mathcal{E}$. We define $\mathcal{M}(\rho_{\min}, \rho_{\max})$
as the class of Markov transitions $\rho(v, dv')$ on $\mathcal{E}$ such that

$$\rho_{\min}(A) \le \rho(v, A) \le \rho_{\max}(A), \qquad A \subset \mathcal{E},\ v \in \mathcal{E}. \tag{18}$$

Remark 4. Control (18) ensures the geometric ergodicity of the process of
variability in the growth rate.

Let us be given in the sequel a vector of positive constants $c = (r, m, \ell, L)$
and $0 < e_{\min} \le e_{\max}$ such that $\mathcal{E} \subset [e_{\min}, e_{\max}]$. We introduce the Lyapunov
function

V(x,v) = V(a?) =exp(-^-x A ) for (x,v)eS. (19) 

The function $\mathbb{V}$ controls the rate of the geometric ergodicity of the chain
with transition $\mathcal{P}_B$ and will appear in the proof of Proposition 4 below.
Define 

Assumption 3 (The sparse tree case). Let $\lambda > 0$. We have $\delta(c) < 1$.

In the case of the full tree observation scheme, we will need more
stringent (and technical) conditions on c. Let $\gamma_{B,\mathbb{V}}$ denote the spectral radius of
the operator $\mathcal{P}_B - 1 \otimes \nu_B$ acting on the Banach space of functions $g : S \to \mathbb{R}$
such that

$$\sup\{|g(\mathbf{x})| / \mathbb{V}(\mathbf{x}),\ \mathbf{x} \in S\} < \infty,$$

where $\mathbb{V}$ is defined in (19) above.

Assumption 4 (The full tree case). We have $\delta(c) < \tfrac{1}{2}$ and moreover

$$\sup \gamma_{B,\mathbb{V}} < \tfrac{1}{2}. \tag{20}$$

Remark 5. It is possible to obtain bounds on c so that (20) holds, by using
explicit (yet intricate) bounds on $\gamma_{B,\mathbb{V}}$ following Fort et al. [14] or Douc et
al. [8], see also Baxendale [3].

Rate of convergence

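We are ready to state our main result. For s > 0, with $s = \lfloor s \rfloor + \{s\}$,
$0 < \{s\} \le 1$ and $\lfloor s \rfloor$ an integer, introduce the Hölder space $\mathcal{H}^s(\mathcal{D})$ of
functions $f : \mathcal{D} \to \mathbb{R}$ possessing a derivative of order $\lfloor s \rfloor$ that satisfies

$$\big|f^{(\lfloor s \rfloor)}(y) - f^{(\lfloor s \rfloor)}(x)\big| \le c(f)\,|x - y|^{\{s\}}. \tag{21}$$

The minimal constant c(f) such that (21) holds defines a semi-norm $|f|_{\mathcal{H}^s(\mathcal{D})}$.
We equip the space $\mathcal{H}^s(\mathcal{D})$ with the norm

$$\|f\|_{\mathcal{H}^s(\mathcal{D})} = \|f\|_{L^\infty(\mathcal{D})} + |f|_{\mathcal{H}^s(\mathcal{D})}$$

and the Hölder balls

$$\mathcal{H}^s(\mathcal{D}, M) = \{B,\ \|B\|_{\mathcal{H}^s(\mathcal{D})} \le M\}, \quad M > 0.$$

Theorem 2. Work under Assumption 3 in the sparse tree case and
Assumption 4 in the full tree case. Specify $\widehat{B}_n$ with a kernel K satisfying Assumption
2 for some $n_0 > 0$ and

$$h_n = c_0\,n^{-1/(2s+1)}, \qquad \varpi_n = (\log n)^{-1}.$$

For every M > 0 there exist $c_0 = c_0(c, M)$ and $d(c) \ge 0$ such that for every
$0 < s < n_0$ and every compact interval $\mathcal{D} \subset (d(c), \infty)$ such that $\inf \mathcal{D} \ge r/2$,
we have

$$\sup_{\rho, B}\ \mathbb{E}_\mu\Big[\|\widehat{B}_n - B\|^2_{L^2(\mathcal{D})}\Big]^{1/2} \lesssim (\log n)\,n^{-s/(2s+1)},$$

where the supremum is taken over

$$\rho \in \mathcal{M}(\rho_{\min}, \rho_{\max}) \quad \text{and} \quad B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M),$$

and $\mathbb{E}_\mu[\cdot]$ denotes expectation with respect to any initial distribution $\mu(d\mathbf{x})$
for $(\xi_\emptyset, \tau_\emptyset)$ on S such that $\int_S \mathbb{V}(\mathbf{x})^2\,\mu(d\mathbf{x}) < \infty$.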

Several remarks are in order: 1) We obtain the classical rate $n^{-s/(2s+1)}$
(up to a log term) which is optimal in a minimax sense for density
estimation. It is presumably optimal in our context, using for instance classical
techniques for nonparametric estimation lower bounds on functions of
transition densities of Markov chains, see for instance [15]. 2) The extra
logarithmic term is due to technical reasons: we need it in order to control the
decay of correlations of the observations over the full tree structure. 3) The
knowledge of the smoothness s that is needed for the construction of $\widehat{B}_n$ is
not realistic in practice. An adaptive estimator could be obtained by using
a data-driven bandwidth in the estimation of the invariant density $\nu_B(y/2)$
in (15). The Goldenshluger-Lepski bandwidth selection method [16], see
also [10], would presumably yield adaptation, but checking the assumptions
still requires a proof in our setting. We implement a data-driven bandwidth
in the numerical Section 4 below.
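
To give a feeling for the orders of magnitude involved, the following Python sketch computes the tuning parameters of Theorem 2 and compares the polynomial parts of the two rates discussed in the paper. The constant c0 is not specified numerically in the text, so the default value 1.0 below is purely an assumption, and the function names are hypothetical. The logarithmic factor of Theorem 2 is left out of the comparison.

```python
import math

def tuning_parameters(n, s, c0=1.0):
    """Bandwidth c0 * n^(-1/(2s+1)) and threshold 1/log(n)."""
    bandwidth = c0 * n ** (-1.0 / (2.0 * s + 1.0))
    threshold = 1.0 / math.log(n)
    return bandwidth, threshold

def polynomial_rate(n, s, extra=1):
    """n^(-s/(2s+extra)): extra=1 for the rate of Theorem 2 (without the log
    factor), extra=3 for the rate of the indirect observation scheme of [10]."""
    return n ** (-s / (2.0 * s + extra))

n, s = 2 ** 17, 1.0
h_n, w_n = tuning_parameters(n, s)
tree_rate = polynomial_rate(n, s, extra=1)      # n^(-1/3)
indirect_rate = polynomial_rate(n, s, extra=3)  # n^(-1/5)
```

For moderate n the factor log n is not negligible, so the comparison of the polynomial parts should be read as an asymptotic statement.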

4 Numerical implementation 

4.1 Protocol and results 
Generating simulated data 

Given a division rate B(x), a growth rate kernel $\rho$, an initial distribution
$\mu(d\mathbf{x})$ for the node $(\xi_\emptyset, \tau_\emptyset)$ (as in Theorem 2) and a dataset size $n = 2^{N_n}$,
we simulate the full tree and the sparse tree schemes recursively:

1. Given $(\xi_{u^-}, \tau_{u^-})$, we select at random its lifetime $\zeta_{u^-}$ (by a rejection
sampling algorithm) with probability density

$$t \leadsto B\big(\xi_{u^-} \exp(\tau_{u^-} t)\big) \exp\Big(-\int_0^t B\big(\xi_{u^-} \exp(\tau_{u^-} s)\big)\,ds\Big),$$

following the computations of Section 3.2.

2. We derive the two sizes at birth $\xi_u$ (with $u = (u^-, 0)$ and $(u^-, 1)$) by
Formula (8).

3. We simulate at random the growth rates $\tau_u$ (by the rejection sampling
algorithm) according to the distribution $\rho(\tau_{u^-}, dv)$.

4. For the sparse tree case, we select only one offspring (either $(u^-, 0)$ or
$(u^-, 1)$), whereas we keep both for the full tree case.
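
The recursion above can be sketched in Python for the sparse tree case. This is an illustration under stated assumptions, not the authors' code. It takes B(x) = x^2, for which the lifetime can be drawn by inversion of the cumulative hazard instead of the rejection sampling mentioned in step 1. The growth rate kernel used here (a uniform step around the parent's rate, redrawn until it falls in [0.2, 3]) and all function names and default values are hypothetical.

```python
import numpy as np

def sample_lifetime(x, v, rng):
    """Lifetime of a cell born at size x with growth rate v when B(x) = x**2.

    The cumulative hazard up to time t is x**2 * (exp(2*v*t) - 1) / (2*v);
    setting it equal to a unit exponential draw and solving for t gives the
    lifetime (inversion method)."""
    e = rng.exponential()
    return np.log1p(2.0 * v * e / x ** 2) / (2.0 * v)

def sample_growth_rate(v_parent, rng, step=0.5, low=0.2, high=3.0):
    """Hypothetical kernel: uniform step around the parent's rate, redrawn
    until the candidate lies in [low, high]."""
    while True:
        v = v_parent + rng.uniform(-step, step)
        if low <= v <= high:
            return v

def simulate_sparse_tree(n, seed=0, x0=1.0, v0=1.0):
    """Sizes at birth, growth rates and lifetimes of n cells along one branch."""
    rng = np.random.default_rng(seed)
    xi, tau, zeta = np.empty(n), np.empty(n), np.empty(n)
    x, v = x0, v0
    for k in range(n):
        t = sample_lifetime(x, v, rng)
        xi[k], tau[k], zeta[k] = x, v, t
        x = 0.5 * x * np.exp(v * t)      # relation (8): equal binary fission
        v = sample_growth_rate(v, rng)   # inherited growth rate
    return xi, tau, zeta

xi, tau, zeta = simulate_sparse_tree(1000, seed=1)
```

For the full tree case, the same two sampling functions would be applied to both daughters of every cell instead of a single one.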

In order to stay in line with previous simulations of [10] we pick $B(x) = x^2$.
We fix $\mu(d\mathbf{x})$ as the uniform distribution over $[1/3, 3] \times \mathcal{E}$, with $\mathcal{E} = [0.2, 3]$.
As for the growth rate kernel, we implement

$$\rho(v, dv') = g(v' - v)\,dv'$$

where g is a uniform distribution over $[1 - \alpha, 1 + \alpha]$ for some $\alpha > 0$, and
dilated by a scaling factor so that $\big(\int (v' - v)^2 \rho(v, dv')\big)^{1/2} = 1/2$. We also
condition the values of $\tau_u$ to stay in $\mathcal{E}$ (by rejection sampling).

Implementing $\widehat{B}_n$

We implement $\widehat{B}_n$ using Formula (15). We pick a standard Gaussian kernel
$K(x) = (2\pi)^{-1/2}\exp(-x^2/2)$, for which $n_0 = 1$ in Assumption 2; hence
we expect a rate of convergence of order $n^{-1/3}$ at best. We evaluate
$\widehat{B}_n$ on a regular grid $x_1 = \Delta x, \ldots, x_m = m\Delta x$ with $\Delta x = n^{-1/2}$ and $x_m = 5$.
Thus $x_m$ is large enough so that $\nu_B(x/2)$ becomes negligible for $x \ge x_m$ and
$\Delta x$ is smaller than $n^{-1/3}$ to avoid numerical discrepancies. For tractability
purposes, we wish to avoid the use of any relationship between the nodes
$u \in \mathcal{U}_n$. Indeed, whereas it is quite easy to label $u^-$ and u in the sparse tree
case, it is a bit more difficult to track the parent of each individual in the
full tree case if we do not want to double the memory. As a consequence,
using (8) to express the event $\{\xi_u \ge y/2\}$ through the size at division of the
parent, we simply reformulate (15) into

$$\widehat{B}_n(y) = \frac{y}{2}\,\frac{n^{-1}\sum_{u \in \mathcal{U}_n} K_h(\xi_u - y/2)}{n^{-1}\sum_{u \in \mathcal{U}_n} \frac{1}{\tau_u}\,1_{\{\xi_u \le y \le \xi_u \exp(\tau_u \zeta_u)\}} \vee \varpi}. \tag{22}$$

We take $h_n = n^{-1/3}$ for the bandwidth according to Theorem 2 to serve as
a proof of concept. Data-driven choices could of course be made, such as
the Goldenshluger and Lepski method [16, 10], and improve the already
fairly good results shown in Figure 2. Finally, we also test whether taking
into account variability in the growth rate improves significantly or not the
estimate of B, replacing $\tau_u$ by its mean value $n^{-1}\sum_{u \in \mathcal{U}_n} \tau_u$ everywhere in
Formula (22), thus ignoring growth variability in that case.
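
The implementation choices above can be sketched with NumPy as follows. This is a hedged illustration rather than the authors' code: it assumes the data are three arrays of equal length holding, for each observed cell, its size at birth, growth rate and lifetime, and it uses the parent-free form of the estimator in which each cell contributes through its own size at birth and size at division. The function and argument names are hypothetical.

```python
import numpy as np

def gaussian_kernel(x):
    """Standard Gaussian density, applied elementwise."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

def estimate_on_grid(grid, xi, tau, zeta, h, threshold):
    """Evaluate the parent-free estimator at every point of `grid`.

    xi, tau, zeta : size at birth, growth rate and lifetime of each cell
    h             : bandwidth
    threshold     : lower bound applied to the empirical denominator
    """
    y = np.asarray(grid, dtype=float)[:, None]            # shape (m, 1)
    xi = np.asarray(xi, dtype=float)[None, :]             # shape (1, n)
    tau = np.asarray(tau, dtype=float)[None, :]
    zeta = np.asarray(zeta, dtype=float)[None, :]
    # Numerator: kernel estimate of the birth-size density at y / 2.
    density = gaussian_kernel((xi - y / 2.0) / h).mean(axis=1) / h
    # Denominator: cells born below y that divide above y, weighted by 1/tau.
    size_at_division = xi * np.exp(tau * zeta)
    straddles = (xi <= y) & (y <= size_at_division)
    denominator = (straddles / tau).mean(axis=1)
    return 0.5 * y[:, 0] * density / np.maximum(denominator, threshold)
```

To mimic the comparison that ignores variability, the same function can be called with `tau` replaced by `np.full_like(tau, tau.mean())`. The intermediate arrays have one row per grid point and one column per cell, so for large n the grid may need to be processed in chunks.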

Numerical results 

We display our numerical results as specified above in Figures 1, 2 and
3. Figure 1 displays the reconstruction of B on the full tree scheme for a
simulated sample of size $n = 2^{17}$. At a visual level, we see that the estimation
deteriorates dramatically when the variability is ignored in the region where
$\nu_B$ is small, while our estimator (22) still shows good performances.

In Figure 2, we plot on a log-log scale the empirical mean error of our 
estimation procedure for both full tree and sparse tree schemes. The nu- 
merical results agree with the theory. The empirical error is computed as 
follows: we compute 

H-B - -BllAx.m i; = 1j _ jM> ( 2 3) 



LB I 



Ax,m,vj 



where || • Hae,™,^ denotes the discrete norm over the numerical sampling de- 
scribed above, conditioned on the fact that the denominator in (22) is larger 
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Figure 1: Reconstruction for n = 2 17 and w = n~ 1 / 2 . When the variabil- 
ity in the growth rate is ignored, the estimate reveals unsatisfactory. The 
parameter values are the reference ones. 



log 2 (ra) 


5 


6 


7 


8 


9 


10 


e 


0.2927 


0.1904 


0.1460 


0.1024 


0.0835 


0.0614 


std. dev. 


0.1775 


0.0893 


0.0627 


0.0417 


0.0364 


0.0241 



Table 1: Relative error e for B and its standard deviation, with respect to n 
(on a log scale). The error is computed using (23) with w = l/log(n). 

than $\varpi/\log(n)$. We end up with a mean empirical error $\bar e = M^{-1}\sum_{i=1}^{M} e_i$.
The number of Monte-Carlo samples is chosen as M = 100. In Figure 3,
we explore further the degradation of the estimation process in the region
where $\nu_B$ is small, plotting 95% confidence intervals of the empirical
distribution of the estimates, based on M = 100 Monte-Carlo samples. Finally,
Table 1 displays the relative error for the reconstruction of B according to
(23). The standard deviation is computed as $\big(M^{-1}\sum_{i=1}^{M}(e_i - \bar e)^2\big)^{1/2}$. We
also carried out control experiments for other choices of variability kernel
$\rho(v,dv')$ for the growth rate. These include $\rho(v,dv') = g(v')dv'$, so that
the variability of an individual is not inherited from its parent, and a Gaussian
density for g with the same prescription for the mean and the variance as
in the uniform case, conditioned to live on $[e_{\min}, e_{\max}]$. We also tested the
Figure 2: Error vs. n for the full tree and the sparse tree case on a log-log
scale. The error actually proves better than the upper rate of convergence
announced in Theorem 2, and $\varpi$ may be taken smaller than log(n). Estimates
are comparable for both schemes. The parameter values are the reference
ones.

absence of variability, with $\rho(v,dv') = \delta_\tau(dv')$, with $\tau = 1$. None of these
control experiments shows any significant difference from the case displayed
in Figures 1, 2 and 3.
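The Monte-Carlo summary described above (relative error (23), its mean and its standard deviation) can be sketched as follows. This is only an illustration under stated assumptions: the discrete norm is taken to be a plain Euclidean norm on a uniform grid, and the names `relative_error`, `summarise` and `mask` are hypothetical helpers, not part of the original numerical protocol.

```python
import math

def relative_error(b_hat, b_true, mask):
    """Relative discrete L2 error of one estimate (assumed form of (23)).

    `mask[j]` is True where the denominator of the estimator exceeds the
    threshold, so that only those grid points enter the norm.
    """
    num = sum((bh - b) ** 2 for bh, b, keep in zip(b_hat, b_true, mask) if keep)
    den = sum(b ** 2 for b, keep in zip(b_true, mask) if keep)
    return math.sqrt(num / den)

def summarise(errors):
    """Mean error and standard deviation with the 1/M normalisation of the text."""
    m = len(errors)
    mean = sum(errors) / m
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / m)
    return mean, std

# Toy usage with two made-up Monte-Carlo errors.
mean_error, std_error = summarise([0.1, 0.3])
```

On a uniform grid the mesh size cancels in the ratio, which is why it does not appear in `relative_error`.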

Analysis on E. coli data 

Finally, we analyse a dataset obtained through microscopic time-lapse imag- 
ing of single bacterial cells growing in rich medium, by Wang, Robert et al. 
[31]. Thanks to a microfluidic set-up, the experimental conditions are well 
controlled and stable, so that the cells are in a steady state of growth (so- 
called balanced growth). The observation scheme corresponds to the sparse 
tree case: at each generation, only one offspring is followed. The growth
and division of the cells are followed by microscopy, and image analysis
allows us to determine the time evolution of the size of each cell, from birth to
division. We extracted the quantities of interest for our implementation:
for each cell, its size at birth, growth rate and lifetime. We consider that
cells divide equally into two daughter cells, neglecting the small differences
of size at birth between daughter cells. Each cell grows exponentially fast,
but growth rates exhibit variability.

Figure 3: Reconstruction for $n = 2^{10}$, error band for 95%, full tree case,
over M = 100 simulations, with $\varpi = 1/n$ in order to emphasise that the
larger x, the smaller $\nu_B$ and the larger the error estimate.

Our data is formed by the concatenation of several lineages, each of them
composed of a line of offspring descending from a single initial cell picked
at random in a culture. Some of the first and last generations were not
considered in order to avoid any experimental disturbance linked either to
non-stationary conditions or to aging of the cells.

We proceed as in the above protocol. Figure 4 shows the reconstructed
B and $\nu_B$ for a sample of n = 2335 cells. Though much more precise
and reliable, thanks both to the experimental device and the reconstruction
method, our results are qualitatively in accordance with previous indirect
reconstructions carried out in [11] on old datasets published in [21] back in
1969.

The reconstruction of the division rate is prominent here since it appears 
to be the last component needed for a full calibration of the model. Thus, 
our method provides the biologists with a complete understanding of the 
size dependence of the biological system. Phenotypic variability between 
genetically identical cells has recently received growing attention with the 
recognition that it can be genetically controlled and subject to selection 
pressures [20]. Our mathematical framework allows the incorporation of this 
variability at the level of individual growth rates. It should allow the study of 
the impact of variability on the population fitness and should be of particular
importance to describe the growth of populations of cells exhibiting high
variability of growth rates. Several examples of high variability have been
described, in both genetically engineered and natural bacterial populations
[29, 30].



Figure 4: Estimation of B (dotted line) and $\nu_B$ (solid line) on experimental
data of E. coli dividing cells, n = 2335. On the abscissa, the bacterial length is
in arbitrary units.



4.2 Link with the deterministic viewpoint 

Considering the reconstruction formula (15), let us give here some insight
from a deterministic analysis perspective. For the sake of clarity, let us focus
on the simpler case when there is no variability, so that for all $u \in \mathcal{U}_n$ we
have $\tau_u = \tau > 0$, a fixed constant. Formula (15) comes from (14), which in
the case $\tau_u = \tau$ simplifies further into

$$B(y) = \frac{\tau y}{2}\,\frac{\nu_B(y/2)}{\mathbb{E}_{\nu_B}\big[\mathbf{1}_{\{\xi_{u^-} \le y,\ \xi_u \ge y/2\}}\big]} = \frac{\tau y}{2}\,\frac{\nu_B(y/2)}{\int_{y/2}^{y} \nu_B(z)\,dz}. \tag{24}$$

We also notice that, in this particular case, we do not need to measure the
lifetime of each cell in order to implement (24). Define $N(y) = \frac{1}{\tau y}\int_{y/2}^{y} \nu_B(z)\,dz$, or
equivalently $\nu_B(x) = 2B(2x)N(2x)$. Differentiating (24), we obtain

$$\partial_x(\tau x N) = 2B(2x)N(2x) - B(x)N(x)$$
which corresponds to the stationary state linked to the equation

$$\partial_t n(t,x) + \tau\,\partial_x\big(x\,n(t,x)\big) = 2B(2x)\,n(t,2x) - B(x)\,n(t,x), \qquad n(0,x) = n^{(0)}(x),\ x \ge 0. \tag{25}$$
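The differentiation step behind the stationary equation can be written out explicitly. The sketch below assumes that $N$ is defined by $\tau y N(y) = \int_{y/2}^{y}\nu_B(z)\,dz$, which is the reading consistent with the identity $\nu_B(x) = 2B(2x)N(2x)$ stated above; it is a worked check, not a quotation of the original derivation.

```latex
\begin{align*}
\tau y\,N(y) &= \int_{y/2}^{y} \nu_B(z)\,dz, \\
\partial_y\big(\tau y\,N(y)\big) &= \nu_B(y) - \tfrac12\,\nu_B(y/2) \\
&= 2B(2y)N(2y) - B(y)N(y),
\end{align*}
```

where the last line uses $\nu_B(y) = 2B(2y)N(2y)$ and $\nu_B(y/2) = 2B(y)N(y)$. This is the transport-fragmentation equation above with the time derivative set to zero.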

Eq. (25) exactly corresponds to the behaviour of the tagged cell of Section
5.1 below, in a (weak) sense:

$$n(t,dx) = \mathbb{P}\big(\chi(t) \in dx\big)$$

where $\chi(t)$ denotes the size at time t along a branch picked at random,
see Section 5.1. Existence and uniqueness of an invariant measure $\nu_B$ is
analogous to the existence of a steady state solution for the PDE (25),
and the convergence of the empirical measure to the invariant measure mirrors the
stability of the steady state [19]. The equality $\nu_B(x) = 2B(2x)N(2x)$ may be
interpreted as follows: N(x) is the steady solution of Eq. (25), and represents
the probability density of a cell population dividing at a rate B and growing
at a rate $x\tau$, but when only one offspring remains alive at each division so
that the total quantity of cells remains constant. The fraction of dividing
cells is represented by the term B(x)N(x) in the equation, with distribution
given by $\tfrac12\nu_B(x/2)$, whereas the fraction of newborn cells is 2B(2x)N(2x).
Eq. (24) can be written in terms of BN as

$$B(y) = \frac{\tau y\,BN(y)}{\int_{y}^{2y} B(z)N(z)\,dz}. \tag{26}$$



This also highlights why we obtain a rate of convergence of order $n^{-s/(2s+1)}$
rather than the rate $n^{-s/(2s+3)}$ obtained with indirect measurements as in
[10]. In that latter case, we observe an n-sample with distribution N. As
shown in [10], one differentiation is necessary to estimate B, therefore we
have a degree of ill-posedness of order 1. In the setting of the present paper,
we rather observe a sample with distribution BN, and B can be recovered
directly from (26), so that we have here a degree of ill-posedness of order 0.
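To make the direct reconstruction concrete, here is a minimal simulation sketch in the no-variability case, using the first form of (24) with a kernel density estimate in the numerator and an empirical frequency in the denominator. Everything specific is an assumption made for illustration: the test rate $B(x) = x^2$ (chosen because the lifetime then has a closed form), the Gaussian kernel, the bandwidth `h`, the `threshold`, and the function names `simulate_lineage` and `estimate_B`.

```python
import math
import random

def simulate_lineage(n, tau=1.0, x0=1.0, seed=0):
    """Follow one daughter per generation (sparse tree) for B(x) = x**2.

    For this B, inverting the cumulated division rate gives a size at
    division equal to sqrt(xi**2 + 2*tau*e), with e standard exponential.
    Returns the sizes at birth xi_0, ..., xi_n.
    """
    rng = random.Random(seed)
    sizes = [x0]
    for _ in range(n):
        e = rng.expovariate(1.0)
        size_at_division = math.sqrt(sizes[-1] ** 2 + 2.0 * tau * e)
        sizes.append(size_at_division / 2.0)  # equal splitting
    return sizes

def estimate_B(sizes, y, tau=1.0, h=0.05, threshold=1e-3):
    """Plug-in estimate of B(y) from parent/daughter sizes at birth."""
    parents, children = sizes[:-1], sizes[1:]
    n = len(children)
    x = y / 2.0
    # Gaussian kernel density estimate of the birth-size density at y/2.
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    density = norm * sum(math.exp(-0.5 * ((c - x) / h) ** 2) for c in children)
    # Empirical frequency of {parent born below y, daughter born above y/2}.
    freq = sum(1 for p, c in zip(parents, children) if p <= y and c >= x) / n
    # Thresholding the denominator avoids dividing by a near-zero frequency.
    return tau * x * density / max(freq, threshold)

sizes = simulate_lineage(20000)
b_hat = estimate_B(sizes, y=1.6)  # to be compared with B(1.6) = 2.56
```

No differentiation of an estimated density is needed here, which is the practical counterpart of the degree of ill-posedness 0 discussed above.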



5 Proofs 

The notation $\lesssim$ means inequality up to a constant that does not depend on
n. We set $a_n \sim b_n$ when $a_n \lesssim b_n$ and $b_n \lesssim a_n$ simultaneously. A mapping
$f : \mathcal{E} \to \mathbb{R}$ or $g : [0,\infty) \to \mathbb{R}$ is implicitly identified as a function on $\mathcal{S}$ via
$f(x,v) = f(v)$ and $g(x,v) = g(x)$.
5.1 A many-to-one formula via a tagged cell

For $u \in \mathcal{U}$, we write $m^i u$ for the i-th parent along the genealogy of u. Define

$$\bar\tau^u_t = \sum_{i=1}^{|u|} \tau_{m^i u}\,\zeta_{m^i u} + \tau_u\,(t - b_u) \quad \text{for } t \in [b_u, b_u + \zeta_u)$$

and 0 otherwise for the cumulated growth rate along its ancestors up to time
t. In the same spirit as tagged fragments in fragmentation processes (see
the book by Bertoin [4] for instance) we pick a branch at random along the
genealogical tree: for every $k \ge 1$, if $\vartheta_k$ denotes the node of the
tagged cell at the k-th generation, we have

$$\mathbb{P}(\vartheta_k = u) = 2^{-k} \quad \text{for every } u \in \mathcal{U} \text{ such that } |u| = k,$$

and 0 otherwise. For $t \ge 0$, the relationship

$$b_{\vartheta_{C_t}} \le t < b_{\vartheta_{C_t}} + \zeta_{\vartheta_{C_t}}$$

uniquely defines a counting process $(C_t, t \ge 0)$ with $C_0 = 0$. The process $C_t$
enables us in turn to define a tagged process of size, growth rate and cumulated
growth rate via

$$\big(\chi(t), V(t), \bar V(t)\big) = \big(\xi^t_{\vartheta_{C_t}}, \tau_{\vartheta_{C_t}}, \bar\tau^{\vartheta_{C_t}}_t\big) \quad \text{for } t \in [b_{\vartheta_{C_t}}, b_{\vartheta_{C_t}} + \zeta_{\vartheta_{C_t}})$$

and 0 otherwise. We have the representation

$$\chi(t) = \frac{x\,e^{\bar V(t)}}{2^{C_t}} \tag{27}$$

and since $V(t) \in [e_{\min}, e_{\max}]$, we note that

$$e_{\min}\,t \le \bar V(t) \le e_{\max}\,t. \tag{28}$$

The behaviour of $(\chi(t), V(t), \bar V(t))$ can be related to certain functionals of
the whole particle system via a so-called many-to-one formula. This is the
key tool to obtain Theorem 1.

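Proposition 3 (A many-to-one formula). Work under Assumption 1. For
$x \in (0,\infty)$, let $\mathbb{P}_x$ be defined as in Lemma 1. For every $t \ge 0$, we have

$$\mathbb{E}_x\big[\phi\big(\chi(t), V(t), \bar V(t)\big)\big] = \mathbb{E}_x\Big[\sum_{u \in \mathcal{U}} \xi^t_u\,\frac{e^{-\bar\tau^u_t}}{x}\,\phi\big(\xi^t_u, \tau_u, \bar\tau^u_t\big)\Big]$$

for every $\phi : \mathcal{S} \times [0,\infty) \to [0,\infty)$.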
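Proof of Proposition 3. For $v \in \mathcal{U}$, set $I_v = [b_v, b_v + \zeta_v)$. By representation
(27), we have

$$\mathbb{E}_x\big[\phi(\chi(t), V(t), \bar V(t))\big] = \mathbb{E}_x\Big[\phi\Big(\frac{x e^{\bar V(t)}}{2^{C_t}}, V(t), \bar V(t)\Big)\Big] = \mathbb{E}_x\Big[\sum_{v \in \mathcal{U}} \phi\Big(\frac{x e^{\bar\tau^v_t}}{2^{|v|}}, \tau_v, \bar\tau^v_t\Big)\mathbf{1}_{\{t \in I_v,\ v = \vartheta_{C_t}\}}\Big].$$

Introduce the discrete filtration $\mathcal{H}_n$ generated by $(\xi_u, \zeta_u, \tau_u)$ for every u such
that $|u| \le n$. Conditioning with respect to $\mathcal{H}_{|v|}$ and noting that on $\{t \in I_v\}$,
we have

$$\mathbb{P}\big(\vartheta_{C_t} = v \,\big|\, \mathcal{H}_{|v|}\big) = \frac{1}{2^{|v|}} = \frac{\xi^t_v\,e^{-\bar\tau^v_t}}{x},$$

we derive

$$\mathbb{E}_x\Big[\sum_{v \in \mathcal{U}} \phi\Big(\frac{x e^{\bar\tau^v_t}}{2^{|v|}}, \tau_v, \bar\tau^v_t\Big)\mathbf{1}_{\{t \in I_v,\ v = \vartheta_{C_t}\}}\Big] = \mathbb{E}_x\Big[\sum_{v \in \mathcal{U}} \phi\Big(\frac{x e^{\bar\tau^v_t}}{2^{|v|}}, \tau_v, \bar\tau^v_t\Big)\frac{1}{2^{|v|}}\mathbf{1}_{\{t \in I_v\}}\Big] = \mathbb{E}_x\Big[\sum_{u \in \mathcal{U}} \xi^t_u\,\frac{e^{-\bar\tau^u_t}}{x}\,\phi\big(\xi^t_u, \tau_u, \bar\tau^u_t\big)\Big].$$

□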



5.2 Proof of Theorem 1 

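We fix $x \in (0,\infty)$ and first prove the result for an initial measure $\mu_x$ as in
Proposition 3. Let $\varphi \in \mathcal{C}^1_0(\mathcal{S})$ be nonnegative. By (10) we have

$$\langle n(t,\cdot), \varphi\rangle = \mathbb{E}_x\Big[\sum_{i=1}^{\infty} \varphi\big(X_i(t), V_i(t)\big)\Big] = \mathbb{E}_x\Big[\sum_{u \in \mathcal{U}} \varphi\big(\xi^t_u, \tau_u\big)\Big]$$

and applying Proposition 3, we derive

$$\langle n(t,\cdot), \varphi\rangle = x\,\mathbb{E}_x\Big[\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)}\Big]. \tag{29}$$

For h > 0, introduce the difference operator

$$\Delta_h f(t) = h^{-1}\big(f(t+h) - f(t)\big).$$

We plan to study the convergence of $\Delta_h\langle n(t,\cdot), \varphi\rangle$ as $h \to 0$ using
representation (29) in restriction to the events $\{C_{t+h} - C_t = i\}$, for i = 0, 1 and
$\{C_{t+h} - C_t \ge 2\}$. Denote by $\mathcal{F}_t$ the filtration generated by the tagged cell
$(\chi(s), V(s), s \le t)$. The following standard estimate, proved in Appendix 6.2,
will be useful later.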
Lemma 1. Assume that B is continuous. Let $x \in (0,\infty)$ and let $\mu_x$ be a
probability measure on $\mathcal{S}$ such that $\mu_x(\{x\} \times \mathcal{E}) = 1$. Abbreviate $\mathbb{P}_{\mu_x}$ by $\mathbb{P}_x$.
For small h > 0, we have

$$\mathbb{P}_x\big(C_{t+h} - C_t = 1 \,\big|\, \mathcal{F}_t\big) = B(\chi(t))\,h + h\,\varepsilon(h),$$

with the property $|\varepsilon(h)| \le \bar\varepsilon(h) \to 0$ as $h \to 0$, for some deterministic $\bar\varepsilon(h)$,
and

$$\mathbb{P}_x\big(C_{t+h} - C_t \ge 2\big) \lesssim h^2.$$

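Since $\varphi \in \mathcal{C}^1_0(\mathcal{S})$, there exists $c(\varphi) > 0$ such that $\varphi(y,v) = 0$ if $y \le c(\varphi)$.
By (28), we infer

$$\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)} \le \sup_{y,v}\varphi(y,v)\,\frac{\exp(e_{\max}t)}{c(\varphi)}. \tag{30}$$

By Lemma 1 and (30), we derive

$$\mathbb{E}_x\Big[\Big|\Delta_h\Big(\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)}\Big)\Big|\mathbf{1}_{\{C_{t+h} - C_t \ge 2\}}\Big] \lesssim h. \tag{31}$$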



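On the event $\{C_{t+h} - C_t = 0\}$, the process V(s) is constant for $s \in [t, t+h)$
and so is $e^{\bar V(s)}/\chi(s)$ thanks to (27). It follows that

$$\Delta_h\Big(\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)}\Big) = \Delta_h\varphi\big(\chi(t), V(s)\big)\Big|_{s=t}\,\frac{e^{\bar V(t)}}{\chi(t)}$$

on $\{C_{t+h} - C_t = 0\}$ and also

$$\Big|\Delta_h\varphi\big(\chi(t), V(s)\big)\Big|_{s=t}\,\frac{e^{\bar V(t)}}{\chi(t)}\Big| \lesssim \sup_{y,v}|\partial_y\varphi(y,v)|\,x\,\frac{\exp(2e_{\max}t)}{c(\varphi)}$$

on $\{C_{t+h} - C_t = 0\}$ likewise. Since $\mathbb{P}_x(C_{t+h} - C_t = 0) \to 1$ as $h \to 0$, by
dominated convergence

$$x\,\mathbb{E}_x\Big[\Delta_h\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)}\mathbf{1}_{\{C_{t+h} - C_t = 0\}}\Big] \to x\,\mathbb{E}_x\big[\partial_1\varphi\big(\chi(t), V(t)\big)V(t)\,e^{\bar V(t)}\big] \quad \text{as } h \to 0. \tag{32}$$

By Proposition 3 again, this last quantity is equal to $\langle n(t,dx,dv), xv\,\partial_x\varphi\rangle$.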
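On $\{C_{t+h} - C_t = 1\}$, we successively have

$$\chi(t+h) = \tfrac12\chi(t) + \varepsilon_1(h),$$

$$\varphi\big(\chi(t+h), V(t+h)\big) = \varphi\big(\chi(t)/2, V(t+h)\big) + \varepsilon_2(h)$$

and

$$\exp\big(\bar V(t+h)\big) = \exp\big(\bar V(t)\big) + \varepsilon_3(h)$$

with the property $|\varepsilon_i(h)| \le \bar\varepsilon_i(h) \to 0$ as $h \to 0$, where $\bar\varepsilon_i(h)$ is deterministic,
thanks to (27) and (28). Moreover,

$$V(t+h) = \tau_{\vartheta_{C_t+1}} \quad \text{on } \{C_{t+h} - C_t = 1\}.$$

It follows that

$$\mathbb{E}_x\Big[\varphi\big(\chi(t+h), V(t+h)\big)\frac{e^{\bar V(t+h)}}{\chi(t+h)}\mathbf{1}_{\{C_{t+h} - C_t = 1\}}\Big] = \mathbb{E}_x\Big[\varphi\big(\chi(t)/2, \tau_{\vartheta_{C_t+1}}\big)\frac{2e^{\bar V(t)}}{\chi(t)}\mathbf{1}_{\{C_{t+h} - C_t = 1\}}\Big] + \varepsilon_2(h)$$

$$= \mathbb{E}_x\Big[\varphi\big(\chi(t)/2, \tau_{\vartheta_{C_t+1}}\big)\frac{2e^{\bar V(t)}}{\chi(t)}\mathbf{1}_{\{C_{t+h} - C_t \ge 1\}}\Big] + \varepsilon_3(h)$$

where $\varepsilon_2(h), \varepsilon_3(h) \to 0$ as $h \to 0$, and where we used the second part of
Lemma 1 in order to obtain the last equality. Conditioning with respect
to $\mathcal{F}_t \vee \tau_{\vartheta_{C_t+1}}$ and using that $\{C_{t+h} - C_t \ge 1\}$ and $\tau_{\vartheta_{C_t+1}}$ are independent,
applying the first part of Lemma 1, this last term is equal to

$$\mathbb{E}_x\Big[\varphi\big(\chi(t)/2, \tau_{\vartheta_{C_t+1}}\big)\frac{2e^{\bar V(t)}}{\chi(t)}B(\chi(t))\,h\Big] + \varepsilon_3(h) = \mathbb{E}_x\Big[\int_{\mathcal{E}}\varphi\big(\chi(t)/2, v'\big)\rho\big(V(t), dv'\big)\frac{2e^{\bar V(t)}}{\chi(t)}B(\chi(t))\,h\Big] + \varepsilon_4(h)$$

where $\varepsilon_4(h) \to 0$ as $h \to 0$. Finally, using Lemma 1 again, we derive

$$\mathbb{E}_x\Big[\Delta_h\Big(\varphi\big(\chi(t), V(t)\big)\frac{e^{\bar V(t)}}{\chi(t)}\Big)\mathbf{1}_{\{C_{t+h} - C_t = 1\}}\Big] \to \mathbb{E}_x\Big[\Big(2\int_{\mathcal{E}}\varphi\big(\chi(t)/2, v'\big)\rho\big(V(t), dv'\big) - \varphi\big(\chi(t), V(t)\big)\Big)\frac{e^{\bar V(t)}}{\chi(t)}B(\chi(t))\Big] \tag{33}$$

as $h \to 0$. By Proposition 3, this last quantity, multiplied by x, is equal to

$$\Big\langle n(t,dx,dv), \Big(\int_{\mathcal{E}} 2\varphi(x/2, v')\rho(v, dv') - \varphi(x,v)\Big)B(x)\Big\rangle$$

which, in turn, is equal to

$$\Big\langle n(t,2dx,dv), \int_{\mathcal{E}} 4\varphi(x, v')\rho(v, dv')B(2x)\Big\rangle - \big\langle n(t,dx,dv), \varphi(x,v)B(x)\big\rangle$$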
by a simple change of variables. Putting together the estimates (31), (32)
and (33), we conclude

$$\partial_t\langle n(t,dx,dv), \varphi\rangle - \langle n(t,dx,dv), xv\,\partial_x\varphi\rangle + \langle n(t,dx,dv)B(x), \varphi\rangle = \Big\langle n(t,2dx,dv), \int_{\mathcal{E}} 4\varphi(x, v')\rho(v, dv')B(2x)\Big\rangle$$

which is the dual formulation of (4). The proof is complete.

5.3 Geometric ergodicity of the discrete model 

We keep the notation of Sections 2 and 3. We first prove Proposition 2.

Proof of Proposition 2

The fact that $\nu_B(d\mathbf{x}) = \nu_B(x,dv)dx$ readily follows from the representation
$\mathcal{P}_B(\mathbf{x}, d\mathbf{x}') = \mathcal{P}_B((x,v), x', dv')dx'$ together with the invariant measure
equation (12). It follows that for every $y \in (0,\infty)$,



< 





By Assumption 1, we have J" a 




oo hence 




It follows that VB{y,dv') is equal to 



B(2y) 



II 



/ / VB(x,dv)dx -A_iexp(-/ -^ds)ds 



B(2y) 




y 



Integrating with respect to dv' 



we obtain the result. 
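Geometric ergodicity

We extend $\mathcal{P}_B$ as an operator acting on functions $f : \mathcal{S} \to [0,\infty)$ via

$$\mathcal{P}_B f(\mathbf{x}) = \int_{\mathcal{S}} f(\mathbf{y})\,\mathcal{P}_B(\mathbf{x}, d\mathbf{y}).$$

If $k \ge 1$ is an integer, define $\mathcal{P}_B^k = \mathcal{P}_B^{k-1} \circ \mathcal{P}_B$.

Proposition 4. Let c satisfy Assumption 3. Then, for every $B \in \mathcal{F}^\lambda(c)$ and
$\rho \in \mathcal{M}(\rho_{\min})$, there exists a unique invariant probability measure of the form
$\nu_B(d\mathbf{x}) = \nu_B(x,dv)dx$ on $\mathcal{S}$. Moreover, there exist $0 < \gamma < 1$, a function
$\mathbb{V} : \mathcal{S} \to [1,\infty)$ and a constant R such that

$$\sup_{B \in \mathcal{F}^\lambda(c),\ \rho \in \mathcal{M}(\rho_{\min})}\ \sup_{|g| \le \mathbb{V}}\Big|\mathcal{P}_B^k g(\mathbf{x}) - \int_{\mathcal{S}} g(\mathbf{z})\,\nu_B(d\mathbf{z})\Big| \le R\,\mathbb{V}(\mathbf{x})\,\gamma^k \tag{34}$$

for every $\mathbf{x} \in \mathcal{S}$, $k \ge 0$, and where the supremum is taken over all functions
$g : \mathcal{S} \to \mathbb{R}$ satisfying $|g(\mathbf{x})| \le \mathbb{V}(\mathbf{x})$ for all $\mathbf{x} \in \mathcal{S}$. Moreover, under
Assumption 4, we can take $\gamma < \tfrac12$. Finally, the function $\mathbb{V}$ is $\nu_B$-integrable for every
$B \in \mathcal{F}^\lambda(c)$ and (34) is well defined.

We will show in the proof that the function $\mathbb{V}$ defined in (19) satisfies
the properties announced in Proposition 4.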



Proof of Proposition 4 

We follow the classical line of establishing successively a condition of
minorisation, strong aperiodicity and drift for the transition operator $\mathcal{P}_B$ (see for
instance [24, 3, 14]; we keep to the notation of Baxendale [3]). Recall
that $0 < e_{\min} \le e_{\max}$ is such that $\mathcal{E} \subset [e_{\min}, e_{\max}]$.

Minorisation condition. Let $B \in \mathcal{F}^\lambda(c)$. Define

$$\varphi_B(y) = \frac{B(2y)}{e_{\max}\,y}\exp\Big(-\int_0^y \frac{B(2s)}{e_{\min}\,s}\,ds\Big). \tag{35}$$

Set $C = (0, r) \times \mathcal{E}$, where r is specified by c. For any measurable $X \times A \subset \mathcal{S}$
and $(x,v) \in C$, we have

$$\mathcal{P}_B\big((x,v), X \times A\big) = \int_A \rho(v,dv')\int_{X \cap [x/2,\infty)} \frac{B(2y)}{vy}\exp\Big(-\int_{x/2}^{y}\frac{B(2s)}{vs}\,ds\Big)dy \ge \rho_{\min}(A)\int_{X \cap [r/2,\infty)}\varphi_B(y)\,dy.$$
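Define

$$\Gamma_B(dy,dv) = c_B^{-1}\,\mathbf{1}_{[r/2,\infty)}(y)\,\varphi_B(y)\,dy\,\rho_{\min}(dv),$$

where

$$c_B = \int_{r/2}^{\infty}\varphi_B(y)\,dy \ge \frac{e_{\min}}{e_{\max}}\exp\Big(-\frac{L}{e_{\min}}\Big) =: \tilde\beta > 0$$

by (16) since $B \in \mathcal{F}^\lambda(c)$. We have thus exhibited a small set C, a probability
measure $\Gamma_B$ and a constant $\tilde\beta > 0$ so that the minorisation condition

$$\mathcal{P}_B\big((x,v), X \times A\big) \ge \tilde\beta\,\Gamma_B(X \times A) \tag{36}$$

holds for every $(x,v) \in C$ and $X \times A \subset \mathcal{S}$, uniformly in $B \in \mathcal{F}^\lambda(c)$. □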

Strong aperiodicity condition. We have 

PT B {C) = Pc£ / Pmin (dv) f ^exp(- T 

■/£ Jr/2 e max2/ A/2 

____ /»r 

> /Jc^ 1 / (p B (y)dy 

Jr/2 

^ll-«P(-f>)) 
Jr/2 

>^(l-exp(-^))=: i 9>0 (37) 

where we applied (16) for the last inequality. □ 

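Drift condition. Let $B \in \mathcal{F}^\lambda(c)$. Let $\mathbb{V} : \mathcal{S} \to [1,\infty)$ be continuously
differentiable and such that for every $v \in \mathcal{E}$,

$$\lim_{y \to \infty}\mathbb{V}(y,v)\exp\Big(-2^\lambda\frac{m}{v\lambda}\,y^\lambda\Big) = 0. \tag{38}$$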

For x > r, by (17) and integrating by part with the boundary condition 
(38), we have, for every v € £ , 

V B Y(x,v)= [ p(v,dv') r V ( y y)^Mexp(- ^ds)dy 

J£ Jx/2 V U Jx/2 

f' /*oo f*y 

< / p(v,dv') / d y Y(y,v')e W {-^ / s x - l ds)dy 

J£ Jx/2 Jx/2 

< (%x X ) J s p(v, dv>) jZ^ {x/2)x V((^) 1A , v') e^dy. 
Pick $\mathbb{V}(x,v) = \mathbb{V}(x) = \exp\big(\frac{m}{e_{\min}\lambda}x^\lambda\big)$ defined in (19) and note that (38) is
satisfied. With this choice, we further infer

P f'OO 

V B Y(x,v) < Y(x,v) / P (v,dv') / , exp ( - (1 - 2- x )y)dy 

^ V (*> v )l3^A exp ( - (1 - 2- A )^r A ) 
since x > r. Recall that 

^ = r= ^ eXp (- (1 - 2_A) ^ rA )- 
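We obtain, for $x \ge r$ and $v \in \mathcal{E}$,

$$\mathcal{P}_B\mathbb{V}(x,v) \le \delta(c)\,\mathbb{V}(x,v) \tag{39}$$

and we have $\delta(c) < 1$ by Assumption 3. We next need to control $\mathcal{P}_B\mathbb{V}$
outside $x \in [r,\infty)$, that is on the small set C. For every $(x,v) \in C$, we have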

PbV(x,i;)< / p(v,dt/)f [ 7/ \(y,v')^^-dy 




< e ^ sup Y(y)L + /z(c)V(r/2) =: Q < oo (40) 

ye[0,r] 

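where we used (16) and the fact that $B \in \mathcal{F}^\lambda(c)$. Combining together (39)
and (40), we conclude

$$\mathcal{P}_B\mathbb{V}(\mathbf{x}) \le \delta(c)\,\mathbb{V}(\mathbf{x})\,\mathbf{1}_{\{\mathbf{x} \notin C\}} + Q\,\mathbf{1}_{\{\mathbf{x} \in C\}}. \tag{41}$$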

□ 

Completion of proof of Proposition 4. The minorisation condition (36)
together with the strong aperiodicity condition (37) and the drift condition
(41) imply inequality (34) by Theorem 1.1 in Baxendale [3], with R and $\gamma$
that explicitly depend on $\delta(c)$, $\beta$, $\tilde\beta$, $\mathbb{V}$ and Q. By construction, this bound
is uniform in $B \in \mathcal{F}^\lambda(c)$ and $\rho \in \mathcal{M}(\rho_{\min})$. More specifically, we have

7 < min{max{(5(c),pv, J B}, 1} 

therefore under Assumption 3 we have $\gamma < 1$ and under Assumption 4, we
obtain the improvement $\gamma < \tfrac12$. □
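5.4 Further estimates on the invariant probability

Lemma 2. For any c such that Assumption 3 is satisfied and any compact
interval $\mathcal{D} \subset (0,\infty)$, we have

$$\sup_{B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M)}\ \sup_{x \in 2^{-1}\mathcal{D}}\nu_B(x) < \infty,$$

with $\nu_B(x) = \int_{\mathcal{E}}\nu_B(x,dv)$.

Proof. Since $B \in \mathcal{F}^\lambda(c)$, $\nu_B$ is well-defined and satisfies

$$\nu_B(x,dv) = \frac{B(2x)}{x}\int_{\mathcal{E}}\int_0^{2x}\nu_B(y,dv')\,\frac{\rho(v',dv)}{v'}\exp\Big(-\int_{y/2}^{x}\frac{B(2s)}{v's}\,ds\Big)dy.$$

Hence $\nu_B(x,dv) \le B(2x)(e_{\min}x)^{-1}\rho_{\max}(dv)$ and also $\nu_B(x) \le B(2x)(e_{\min}x)^{-1}$.
Since $B \in \mathcal{H}^s(\mathcal{D}, M)$ implies $\sup_{x \in 2^{-1}\mathcal{D}}B(2x) = \|B\|_{L^\infty(\mathcal{D})} \le M$, the
conclusion follows. □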

Lemma 3. For any c such that Assumption 3 is satisfied, there exists a
constant $d(c) \ge 0$ such that for any compact interval $\mathcal{D} \subset (d(c),\infty)$, we have

$$\inf_{B \in \mathcal{F}^\lambda(c)}\ \inf_{x \in \mathcal{D}}\varphi_B(x)^{-1}\nu_B(x) > 0,$$

where $\varphi_B(x)$ is defined in (35).

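Proof. Let $g : [0,\infty) \to [0,\infty)$ satisfy $g(x) \le \mathbb{V}(x) = \exp\big(\frac{m}{e_{\min}\lambda}x^\lambda\big)$ for every
$x \in [0,\infty)$. By Proposition 4, we have

$$\sup_{B \in \mathcal{F}^\lambda(c)}\int_{[0,\infty)} g(x)\,\nu_B(x)\,dx < \infty, \tag{42}$$

as a consequence of (34) with k = 1 together with the property that
$\sup_{B \in \mathcal{F}^\lambda(c)}\mathcal{P}_B\mathbb{V}(\mathbf{x}) < \infty$ for every $\mathbf{x} \in \mathcal{S}$, as follows from (41) in the proof
of Proposition 4. Next, for every $x \in (0,\infty)$, we have

$$\int_{2x}^{\infty}\nu_B(y)\,dy \le \exp\Big(-\frac{m}{e_{\min}\lambda}(2x)^\lambda\Big)\int_{[0,\infty)}\mathbb{V}(y)\,\nu_B(y)\,dy$$

and this bound is uniform in $B \in \mathcal{F}^\lambda(c)$ by (42). Therefore, for every
$x \in (0,\infty)$, we have

$$\sup_{B \in \mathcal{F}^\lambda(c)}\int_{2x}^{\infty}\nu_B(y)\,dy \le c(c)\exp\Big(-\frac{m}{e_{\min}\lambda}(2x)^\lambda\Big) \tag{43}$$

for some $c(c) > 0$. Let

$$d(c) \ge \Big(\frac{e_{\min}\lambda}{2^\lambda m}\log c(c)\Big)^{1/\lambda}\mathbf{1}_{\{c(c) > 1\}}. \tag{44}$$

By definition of $\nu_B$, for every $x \in (0,\infty)$, we now have

$$\nu_B(x,dv) = \frac{B(2x)}{x}\int_{\mathcal{E}}\int_0^{2x}\nu_B(y,dv')\,\frac{\rho(v',dv)}{v'}\exp\Big(-\int_{y/2}^{x}\frac{B(2s)}{v's}\,ds\Big)dy$$

$$\ge \frac{B(2x)}{e_{\max}\,x}\exp\Big(-\int_0^{x}\frac{B(2s)}{e_{\min}\,s}\,ds\Big)\int_0^{2x}\nu_B(y)\,dy\ \rho_{\min}(dv)$$

$$\ge \frac{B(2x)}{e_{\max}\,x}\exp\Big(-\int_0^{x}\frac{B(2s)}{e_{\min}\,s}\,ds\Big)\Big(1 - c(c)\exp\Big(-\frac{m}{e_{\min}\lambda}(2x)^\lambda\Big)\Big)\rho_{\min}(dv)$$

where we used (43) for the last inequality. By (44), for $x > d(c)$ we have

$$1 - c(c)\exp\Big(-\frac{m}{e_{\min}\lambda}(2x)^\lambda\Big) > 0$$

and the conclusion readily follows by integration. □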

5.5 Covariance inequalities 

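If $u, w \in \mathcal{U}$, we define a(u,w) as the node of the most recent common
ancestor between u and w. Introduce the distance

$$D(u,w) = |u| + |w| - 2|a(u,w)|.$$

Proposition 5. Work under Assumption 3. Let $\mu$ be a probability distribution
on $\mathcal{S}$ such that $\int_{\mathcal{S}}\mathbb{V}(\mathbf{x})^2\mu(d\mathbf{x}) < \infty$. Let $G : \mathcal{S} \to \mathbb{R}$ and $H : [0,\infty) \to \mathbb{R}$
be two bounded functions. Define

$$Z(\xi_{u^-}, \tau_{u^-}, \xi_u) = G(\xi_{u^-}, \tau_{u^-})H(\xi_u) - \mathbb{E}_{\nu_B}\big[G(\xi_{u^-}, \tau_{u^-})H(\xi_u)\big].$$

For any $u, w \in \mathcal{U}$ with $|u|, |w| \ge 1$, we have

$$\big|\mathbb{E}_\mu\big[Z(\xi_{u^-}, \tau_{u^-}, \xi_u)\,Z(\xi_{w^-}, \tau_{w^-}, \xi_w)\big]\big| \lesssim \gamma^{D(u,w)} \tag{45}$$

uniformly in $B \in \mathcal{F}^\lambda(c)$, with $\gamma$ and $\nu_B$ defined in (34) of Proposition 4.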

Proof. In view of (45), with no loss of generality, we may (and will) assume
that for every $(x,v) \in \mathcal{S}$

$$|G(x,v)| \le \mathbb{V}(x) \quad \text{and} \quad |H(x)| \le \mathbb{V}(x). \tag{46}$$

Applying repeatedly the Markov property along the branch that joins the
nodes $a^-(u,w) := a(u^-, w^-)$ and w, we have

[G(€u- ,t u -)H(£ u )\ £ a - , r a - ( U)1U )] 

_ -p^ I [a ^'"'^(GPB-H')(Ca-(?i,to) 5 r a-(?i,M))) 

with an analogous formula for G(£ w - , r^- )H(£ W ). Conditioning with respect 
to ( u ,w), T a - ( u ,w), it follows that 

e m [-2" > T u- , iu)Z{£ w - , t w - , £ w )} 

(^'"'^'(GP^fe-^),^-^)) -E ub [GV b H(^,t )))' . 
Applying Proposition 4 thanks to Assumption 3 and (46), we further infer 

< R 2 sup G(x,v) 2 H(x) 2 E^ a - (u>w) f]^ u ^ 

(x,v) 



< 



7? K(«.™)l( V 2 )( a; ) At (d a; )7 B ( M ^. 



We leave to the reader the straightforward task to check that the choice of $\mathbb{V}$
in (19) implies that $\mathbb{V}^2$ satisfies (41). It follows that Proposition 4 applies,
replacing $\mathbb{V}$ by $\mathbb{V}^2$ in (34). In particular,

$$\sup_{B \in \mathcal{F}^\lambda(c)}\mathcal{P}_B^{|a^-(u,w)|}(\mathbb{V}^2)(\mathbf{x}) \lesssim 1 + \mathbb{V}(\mathbf{x})^2. \tag{47}$$

Since $\mathbb{V}^2$ is $\mu$-integrable by assumption, inequality (45) follows. □

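Proposition 6. Work under Assumption 3. Let $\mu$ be a probability on $\mathcal{S}$ such
that $\int_{\mathcal{S}}\mathbb{V}(\mathbf{x})^2\mu(d\mathbf{x}) < \infty$. Let $x_0$ be in the interior of $\tfrac12\mathcal{D}$. Let $H : \mathbb{R} \to \mathbb{R}$
be bounded with compact support. Set

$$\widetilde H\Big(\frac{\xi_u - x_0}{h}\Big) = H\Big(\frac{\xi_u - x_0}{h}\Big) - \mathbb{E}_{\nu_B}\Big[H\Big(\frac{\xi_u - x_0}{h}\Big)\Big].$$

For any $u, w \in \mathcal{U}$ with $|u|, |w| \ge 1$, we have

$$\Big|\mathbb{E}_\mu\Big[\widetilde H\Big(\frac{\xi_u - x_0}{h}\Big)\widetilde H\Big(\frac{\xi_w - x_0}{h}\Big)\Big]\Big| \lesssim \gamma^{D(u,w)} \wedge h\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))} \tag{48}$$

uniformly in $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M)$ for sufficiently small h > 0.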
Proof. The first part of the estimate in the right-hand side of (48) is obtained
by letting G = 1 in (45). We turn to the second part. Repeating the same
argument as for (45) and conditioning with respect to $\xi_{a(u,w)}$, we obtain



( V h\-Hu,w)\ H ^ a( „,y-x j _ E ^ ^|^o^l _ (49) 



Assume with no loss of generality that \u\ < \w\ (otherwise, the same subse- 
quent arguments apply exchanging the roles of u and w). On the one hand, 
applying (34) of Proposition 4, we have 

\ P w-Hw)\ H ^.y-*o }_ EvB [h(^)]\ < m(tia(u W )h lwHa{u ' w)l - 

(50) 

On the other hand, identifying H as a function defined on S, for every 
(x, v) S S, we have 

\V B H(^)\ = \ T H(h-\y- X0 ))^Me, P (- f ^ds)dy 

Jx 2 V V Jx/2 



x/2 VV Jx/2 

[0,oo) c miny 



<e m L sup ^-hf \H(x)\dx<h. (51) 

y£{x +h supp(H)} V J [0,oo ) 

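Indeed, since $x_0$ is in the interior of $\tfrac12\mathcal{D}$ we have $\{x_0 + h\,\mathrm{supp}(H)\} \subset \tfrac12\mathcal{D}$
for small enough h, hence $\sup_{y \in \{x_0 + h\,\mathrm{supp}(H)\}}B(2y) \le M$. Now, since $\mathcal{P}_B$ is
a positive operator and $\mathcal{P}_B 1 = 1$, we derive

$$\Big|\mathcal{P}_B^{|u| - |a(u,w)|}H\Big(\frac{\cdot - x_0}{h}\Big)\Big| \lesssim h \tag{52}$$

as soon as $|u| - |a(u,w)| \ge 1$, uniformly in $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M)$. If
$|u| = |a(u,w)|$, since $\int_{\mathcal{E}}\nu_B(dx,dv) = \nu_B(x)dx$, we obtain in the same way

$$\Big|\mathbb{E}_{\nu_B}\Big[H\Big(\frac{\xi_u - x_0}{h}\Big)\Big]\Big| \le \int_{[0,\infty)}\Big|H\Big(\frac{x - x_0}{h}\Big)\Big|\nu_B(x)\,dx \lesssim h \tag{53}$$

using Lemma 2. We have $\big|\mathbb{E}_{\nu_B}\big[H\big(\frac{\xi_u - x_0}{h}\big)\big]\big| \lesssim h$ likewise. Putting together
(52) and (53) we derive

$$\Big|\mathcal{P}_B^{|u| - |a(u,w)|}H\Big(\frac{\xi_{a(u,w)} - x_0}{h}\Big) - \mathbb{E}_{\nu_B}\Big[H\Big(\frac{\xi_u - x_0}{h}\Big)\Big]\Big| \lesssim h$$

and this estimate is uniform in $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M)$. In view of (49) and
(50), we obtain

$$\Big|\mathbb{E}_\mu\Big[\widetilde H\Big(\frac{\xi_u - x_0}{h}\Big)\widetilde H\Big(\frac{\xi_w - x_0}{h}\Big)\Big]\Big| \lesssim h\,\gamma^{|w| - |a(u,w)|}\,\mathbb{E}_\mu\big[\mathbb{V}(\xi_{a(u,w)})\big].$$

We conclude in the same way as in Proposition 5. □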

5.6 Rate of convergence for the empirical measure 

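For every $y \in (0,\infty)$ and $u \in \mathcal{U}$ with $|u| \ge 1$, define

$$D(y) = \mathbb{E}_{\nu_B}\Big[\frac{1}{\tau_{u^-}}\mathbf{1}_{\{\xi_{u^-} \le 2y,\ \xi_u \ge y\}}\Big] \tag{54}$$

and

$$D_n(y)_\varpi = n^{-1}\sum_{u \in \mathcal{U}_n}\frac{1}{\tau_{u^-}}\mathbf{1}_{\{\xi_{u^-} \le 2y,\ \xi_u \ge y\}} \vee \varpi. \tag{55}$$

Proposition 7. Work under Assumption 3 in the sparse tree case and
Assumption 4 in the full tree case. Let $\mu$ be a probability on $\mathcal{S}$ such that
$\int_{\mathcal{S}}\mathbb{V}(\mathbf{x})^2\mu(d\mathbf{x}) < \infty$. If $1 \ge \varpi = \varpi_n \to 0$ as $n \to \infty$, we have

$$\sup_{y \in \mathcal{D}}\mathbb{E}_\mu\big[\big(D_n(y)_{\varpi_n} - D(y)\big)^2\big] \lesssim n^{-1} \tag{56}$$

uniformly in $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(2^{-1}\mathcal{D}, M)$ and $\rho \in \mathcal{M}(\rho_{\min}, \rho_{\max})$.

We first need the following estimate.

Lemma 4. Work under Assumption 3. Let d(c) be defined as in Lemma 3.
For every compact interval $\mathcal{D} \subset (d(c),\infty)$ such that $\inf\mathcal{D} \le r/2$, we have

$$\inf_{B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(2^{-1}\mathcal{D}, M)}\ \inf_{y \in \mathcal{D}}D(y) > 0.$$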

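Proof of Lemma 4. By (13) and the definition of $\varphi_B$ in (35), we readily have

$$D(y) = \frac{1}{e_{\max}}\,\varphi_B(y)^{-1}\nu_B(y)\exp\Big(-\int_0^{y}\frac{B(2s)}{e_{\min}\,s}\,ds\Big).$$

Since $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(2^{-1}\mathcal{D}, M)$, by applying (16) and (17) successively, we
obtain

$$\int_0^{\sup\mathcal{D}}\frac{B(2s)}{e_{\min}\,s}\,ds = \int_0^{r/2}\frac{B(2s)}{e_{\min}\,s}\,ds + \int_{r/2}^{\sup\mathcal{D}}\frac{B(2s)}{e_{\min}\,s}\,ds \le e_{\min}^{-1}\Big(L + M\log\frac{2\sup\mathcal{D}}{r}\Big) < \infty$$

where we used that $\inf\mathcal{D} \le r/2$. It follows that

$$\inf_{y \in \mathcal{D}}\exp\Big(-\int_0^{y}\frac{B(2s)}{e_{\min}\,s}\,ds\Big) \ge \exp\Big(-e_{\min}^{-1}\Big(L + M\log\frac{2\sup\mathcal{D}}{r}\Big)\Big) > 0$$

and Lemma 4 follows by applying Lemma 3. □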
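Proof of Proposition 7. Write $D_n(y)$ for the empirical mean in (55) before thresholding. Since $D_n(y)$ is bounded, we have

$$\big(D_n(y)_{\varpi_n} - D(y)\big)^2 \lesssim \big(D_n(y) - D(y)\big)^2 + \mathbf{1}_{\{D_n(y) < \varpi_n\}}. \tag{57}$$

Next, take n sufficiently large, so that

$$0 < \varpi_n \le q = \tfrac12\inf_{B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(2^{-1}\mathcal{D}, M)}\ \inf_{y \in \mathcal{D}}D(y),$$

a choice which is possible thanks to Lemma 4. Since

$$\{D_n(y) < \varpi_n\} \subset \{D_n(y) - D(y) < -q\},$$

integrating (57), we have that $\mathbb{E}_\mu\big[(D_n(y)_{\varpi_n} - D(y))^2\big]$ is less than a constant
times

$$\mathbb{E}_\mu\big[(D_n(y) - D(y))^2\big] + \mathbb{P}_\mu\big(|D_n(y) - D(y)| \ge q\big),$$

which in turn is less than a constant times $\mathbb{E}_\mu\big[(D_n(y) - D(y))^2\big]$. Set
$G(x,v) = \frac1v\mathbf{1}_{\{x \le 2y\}}$ and $H(x) = \mathbf{1}_{\{x \ge y\}}$ and note that G and H are bounded
on $\mathcal{S}$ (and also uniformly in $y \in \mathcal{D}$). It follows that

$$D_n(y) - D(y) = n^{-1}\sum_{u \in \mathcal{U}_n}\Big(G(\xi_{u^-}, \tau_{u^-})H(\xi_u) - \mathbb{E}_{\nu_B}\big[G(\xi_{u^-}, \tau_{u^-})H(\xi_u)\big]\Big).$$

We then apply (45) of Proposition 5 to infer, with the same notation, that

$$\mathbb{E}_\mu\big[(D_n(y) - D(y))^2\big] = n^{-2}\sum_{u,w \in \mathcal{U}_n}\mathbb{E}_\mu\big[Z(\xi_{u^-}, \tau_{u^-}, \xi_u)\,Z(\xi_{w^-}, \tau_{w^-}, \xi_w)\big] \lesssim n^{-2}\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)}$$

uniformly in $y \in \mathcal{D}$ and $B \in \mathcal{F}^\lambda(c)$. We further separate the sparse and full
tree cases.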
tree cases. 
The sparse tree case. We have $\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)} = \sum_{1 \le |u|,|w| \le n}\gamma^{||u| - |w||}$ by
Proposition 4, and this last quantity is of order n. □

The full tree case. We have $n \sim 2^{N_n}$, where $N_n$ is the number of generations
used to expand $\mathcal{U}_n$. We evaluate

$$M_k = \sum_{|u| = k}\ \sum_{w \in \mathcal{U}_n}\gamma^{D(u,w)} \quad \text{for } k = 0, \ldots, N_n. \tag{58}$$

For k = 0, we have

$$M_0 = 1 + 2\gamma + 4\gamma^2 + \ldots + 2^{N_n}\gamma^{N_n} = \frac{1 - (2\gamma)^{N_n+1}}{1 - 2\gamma} =: \phi_\gamma(N_n).$$

Under Assumption 4, by Proposition 4, we have $\gamma < \tfrac12$, therefore $\phi_\gamma(N_n)$
is bounded as $n \to \infty$. For k = 1, if we start with the node $u = (\emptyset, 0)$,
then the contribution of its descendants in (58) is given by $\phi_\gamma(N_n - 1)$, to
which we must add $\gamma$ for its ancestor corresponding to the node $u = \emptyset$ and
also $\gamma^2\phi_\gamma(N_n - 1)$ for the contribution of the second lineage of the node $u = \emptyset$.
Finally, we must repeat the argument for the node $u = (\emptyset, 1)$. We obtain

$$M_1 = 2\big(\phi_\gamma(N_n - 1) + \gamma + \gamma^2\phi_\gamma(N_n - 1)\big).$$

More generally, proceeding in the same manner, we derive

$$M_k = 2^k\Big(\phi_\gamma(N_n - k) + \big(\gamma + \gamma^2\phi_\gamma(N_n - k)\big) + \ldots + \big(\gamma^i + \gamma^{i+1}\phi_\gamma(N_n - k + (i-1))\big) + \ldots + \big(\gamma^k + \gamma^{k+1}\phi_\gamma(N_n - 1)\big)\Big) \tag{59}$$

for $k = 1, \ldots, N_n$, and this last quantity is of order $2^k$. It follows that

$$\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)} = \sum_{k=0}^{N_n}M_k \lesssim \sum_{k=0}^{N_n}2^k \lesssim 2^{N_n} \lesssim n$$

and the conclusion follows likewise. □

Putting together the sparse and full tree cases, we obtain the proposition.

□
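The counting argument for the full tree can be checked numerically on small trees. The sketch below compares a brute-force evaluation of $\sum_{u,w}\gamma^{D(u,w)}$ with the layer-by-layer expression of $M_k$, and illustrates that the sum stays of the order of the number of nodes when $\gamma < 1/2$. The encoding of nodes as bit tuples and the function names are illustrative choices, not taken from the paper.

```python
from itertools import product

def nodes(depth):
    """All nodes of the full binary tree up to `depth` generations (root = ())."""
    return [bits for k in range(depth + 1) for bits in product((0, 1), repeat=k)]

def distance(u, w):
    """D(u, w) = |u| + |w| - 2|a(u, w)|, where a(u, w) is the common prefix."""
    common = 0
    for a, b in zip(u, w):
        if a != b:
            break
        common += 1
    return len(u) + len(w) - 2 * common

def brute_force_sum(depth, gamma):
    """Direct evaluation of the double sum over all pairs of nodes."""
    tree = nodes(depth)
    return sum(gamma ** distance(u, w) for u in tree for w in tree)

def phi(gamma, n):
    """phi_gamma(n) = sum_{j=0}^{n} (2*gamma)**j."""
    return sum((2 * gamma) ** j for j in range(n + 1))

def formula_sum(depth, gamma):
    """Sum over generations k of M_k, with M_k assembled as in (59)."""
    total = 0.0
    for k in range(depth + 1):
        per_node = phi(gamma, depth - k)  # descendants of u (and u itself)
        for i in range(1, k + 1):
            # i-th ancestor, plus the other lineage hanging below it
            per_node += gamma ** i + gamma ** (i + 1) * phi(gamma, depth - k + i - 1)
        total += 2 ** k * per_node
    return total

ratio = formula_sum(10, 0.3) / (2 ** 11 - 1)  # sum divided by the number of nodes
```

For $\gamma \ge 1/2$ the inner geometric sums are no longer bounded in the depth, which is where Assumption 4 enters in the full tree case.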

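Proposition 8. Work under Assumption 3 in the sparse tree case and
Assumption 4 in the full tree case. Let $\mu$ be a probability on $\mathcal{S}$ such that
$\int_{\mathcal{S}}\mathbb{V}(\mathbf{x})^2\mu(d\mathbf{x}) < \infty$. We have

$$\sup_{y \in 2^{-1}\mathcal{D}}\mathbb{E}_\mu\big[\big(K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)\big)^2\big] \lesssim |\log h_n|\,(n h_n)^{-1} \tag{60}$$

uniformly in $B \in \mathcal{F}^\lambda(c)$.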
Proof. We have, with the notation of Proposition 6,

$$\mathbb{E}_\mu\big[\big(K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)\big)^2\big] \lesssim (n h_n)^{-2}\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)} \wedge h_n\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))} \tag{61}$$

by applying (48) of Proposition 6. It remains to estimate (61).

The sparse tree case. We have a(u,w) = u if $|u| \le |w|$ and a(u,w) = w
otherwise. It follows that

$$\mathbb{E}_\mu\big[\big(K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)\big)^2\big] \lesssim n^{-2}h_n^{-1}\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)}$$

and since $\sum_{u,w \in \mathcal{U}_n}\gamma^{D(u,w)} = \sum_{1 \le |u|,|w| \le n}\gamma^{||u| - |w||}$ is of order n as soon as
$\gamma < 1$, we obtain the result. □

The full tree case. The computations are a bit more involved. Let us evaluate

$$\sum_{|u| = k}\ \sum_{w \in \mathcal{U}_n}\gamma^{D(u,w)} \wedge h_n\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))}.$$

We may repeat the argument displayed in (59) in order to evaluate the
contribution of the term involving $\gamma^{D(u,a(u,w))}$. However, in the estimate
$M_k$, each term $\gamma^i + \gamma^{i+1}\phi_\gamma(N_n - k + (i-1))$ in formula (59) may be replaced
by $h_n\big(\gamma^i + \gamma\phi_\gamma(N_n - k + i - 1)\big)$ up to constants. This corresponds to the
correction given by $h_n\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))}$. As a consequence, we obtain

$$\sum_{|u| = k}\ \sum_{w \in \mathcal{U}_n}\gamma^{D(u,w)} \wedge h_n\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))} \lesssim 2^k\sum_{i=1}^{k}h_n\big(\gamma^i + \gamma\phi_\gamma(N_n - k + i - 1)\big) \wedge \gamma^i\big(1 + \gamma\phi_\gamma(N_n - k + i - 1)\big) \lesssim 2^k\sum_{i=1}^{k}h_n \wedge \gamma^i.$$

Define $k^\star = \big\lfloor |\log h_n| / |\log\gamma| \big\rfloor$. We readily derive

$$2^k\sum_{i=1}^{k}h_n \wedge \gamma^i = 2^k\Big(\sum_{i=1}^{k^\star}h_n + \sum_{i=k^\star+1}^{k}\gamma^i\Big) \lesssim 2^k h_n|\log h_n|,$$

ignoring the second term if $k^\star + 1 > k$. Going back to (61), it follows that

$$(n h_n)^{-2}\sum_{k=0}^{N_n}\ \sum_{|u| = k}\ \sum_{w \in \mathcal{U}_n}\gamma^{D(u,w)} \wedge h_n\,\gamma^{D(u,a(u,w)) \vee D(w,a(u,w))} \lesssim (n h_n)^{-2}\sum_{k=0}^{N_n}2^k h_n|\log h_n| \lesssim |\log h_n|\,(n h_n)^{-1}$$

and the conclusion follows in the full tree case. □

Putting together the sparse and full tree cases, we obtain the proposition.

□



5.7 Proof of Theorem 2 

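From

$$\widehat B_n(2y) = y\,\frac{K_{h_n} \star \widehat\nu_n(y)}{D_n(y)_{\varpi_n}}$$

and

$$B(2y) = y\,\frac{\nu_B(y)}{D(y)},$$

we plan to use the following decomposition

$$\widehat B_n(2y) - B(2y) = y\,(\mathrm{I} + \mathrm{II} + \mathrm{III}),$$

with

$$\mathrm{I} = \frac{K_{h_n} \star \nu_B(y) - \nu_B(y)}{D(y)}, \qquad \mathrm{II} = \frac{K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)}{D_n(y)_{\varpi_n}}, \qquad \mathrm{III} = K_{h_n} \star \nu_B(y)\,\frac{D(y) - D_n(y)_{\varpi_n}}{D_n(y)_{\varpi_n}D(y)},$$

where D(y) and $D_n(y)_\varpi$ are defined in (54) and (55) respectively. It follows
that

$$\|\widehat B_n - B\|^2_{L^2(\mathcal{D})} = 2\int_{2^{-1}\mathcal{D}}\big(\widehat B_n(2y) - B(2y)\big)^2dy \lesssim \mathrm{IV} + \mathrm{V} + \mathrm{VI},$$

with

$$\mathrm{IV} = \int_{2^{-1}\mathcal{D}}\big(K_{h_n} \star \nu_B(y) - \nu_B(y)\big)^2\frac{y^2}{D(y)^2}\,dy,$$

$$\mathrm{V} = \int_{2^{-1}\mathcal{D}}\big(K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)\big)^2D_n(y)_{\varpi_n}^{-2}\,y^2\,dy,$$

$$\mathrm{VI} = \int_{2^{-1}\mathcal{D}}\big(D_n(y)_{\varpi_n} - D(y)\big)^2\big(K_{h_n} \star \nu_B(y)\big)^2\big(D_n(y)_{\varpi_n}D(y)\big)^{-2}y^2\,dy.$$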

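The term IV. We get rid of the term $y^2/D(y)^2$ by Lemma 4 and the fact that $\mathcal{D}$
is bounded. By Assumption 2 and classical kernel approximation, we have
for every $0 < s \le n_0$

$$\mathrm{IV} \lesssim \|K_{h_n} \star \nu_B - \nu_B\|^2_{L^2(2^{-1}\mathcal{D})} \lesssim |\nu_B|^2_{\mathcal{H}^s(2^{-1}\mathcal{D})}\,h_n^{2s}. \tag{62}$$

Lemma 5. Let $\mathcal{D} \subset (0,\infty)$ be a compact interval. Let $B \in \mathcal{F}^\lambda(c)$ for some
c satisfying Assumption 3. We have

$$\|\nu_B\|_{\mathcal{H}^s(2^{-1}\mathcal{D})} \le \psi\big(e_{\min}, e_{\max}, \|B\|_{\mathcal{H}^s(\mathcal{D})}\big)$$

for some continuous function $\psi$.

Proof of Lemma 5. Define

$$\Lambda_B(x,y) = \int_{\mathcal{E}}\frac{\nu_B(y,dv')}{v'}\exp\Big(-\int_{y/2}^{x}\frac{B(2s)}{v's}\,ds\Big).$$

If $B \in \mathcal{H}^s(\mathcal{D})$, then $x \mapsto \Lambda_B(x,y) \in \mathcal{H}^s(2^{-1}\mathcal{D})$ for every $y \in [0,\infty)$, and we
have

$$\|\Lambda_B(\cdot,y)\|_{\mathcal{H}^s(2^{-1}\mathcal{D})} \le \psi_1\big(y, \|B\|_{\mathcal{H}^s(\mathcal{D})}, e_{\min}, e_{\max}\big)$$

for some continuous function $\psi_1$. The result is then a consequence of the
representation $\nu_B(x) = \frac{B(2x)}{x}\int_0^{2x}\Lambda_B(x,y)\,dy$. □

Going back to (62) we infer from Lemma 5 that $|\nu_B|_{\mathcal{H}^s(2^{-1}\mathcal{D})}$ is bounded
above by a constant that depends on $e_{\min}$, $e_{\max}$, $\mathcal{D}$ and M only. It
follows that

$$\mathrm{IV} \lesssim h_n^{2s} \tag{63}$$

uniformly in $B \in \mathcal{H}^s(\mathcal{D}, M)$. □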
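The term V. We have

$$\mathbb{E}_\mu[\mathrm{V}] \le \varpi_n^{-2}\,|\mathcal{D}|\sup_{y \in 2^{-1}\mathcal{D}}y^2\,\mathbb{E}_\mu\big[\big(K_{h_n} \star \widehat\nu_n(y) - K_{h_n} \star \nu_B(y)\big)^2\big].$$

By (60) of Proposition 8 we derive

$$\mathbb{E}_\mu[\mathrm{V}] \lesssim \varpi_n^{-2}\,|\log h_n|\,(n h_n)^{-1} \tag{64}$$

uniformly in $B \in \mathcal{F}^\lambda(c)$. □

The term VI. First, by Lemma 4, the estimate

$$\inf_{B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(2^{-1}\mathcal{D}, M)}\ \inf_{y \in 2^{-1}\mathcal{D}}D_n(y)_{\varpi_n}D(y) \gtrsim \varpi_n$$

holds. Next,

$$\sup_{y \in 2^{-1}\mathcal{D}}|K_{h_n} \star \nu_B(y)| = \sup_{y \in 2^{-1}\mathcal{D}}\Big|\int_{[0,\infty)}K_{h_n}(z - y)\,\nu_B(z)\,dz\Big| \le \sup_{y \in 2^{-1}\mathcal{D}_{h_n}}\nu_B(y)\,\|K\|_{L^1([0,\infty))} \tag{65}$$

where $2^{-1}\mathcal{D}_{h_n} = \{y + z,\ y \in 2^{-1}\mathcal{D},\ z \in \mathrm{supp}(K_{h_n})\} \subset \widetilde{\mathcal{D}}$, for some compact
interval $\widetilde{\mathcal{D}}$ since K has compact support by Assumption 2. By Lemma 2,
we infer that (65) holds uniformly in $B \in \mathcal{F}^\lambda(c)$. We derive

$$\mathbb{E}_\mu[\mathrm{VI}] \lesssim \varpi_n^{-2}\sup_{y \in 2^{-1}\mathcal{D}}\mathbb{E}_\mu\big[\big(D_n(y)_{\varpi_n} - D(y)\big)^2\big].$$

Applying (56) of Proposition 7, we conclude

$$\mathbb{E}_\mu[\mathrm{VI}] \lesssim \varpi_n^{-2}\,n^{-1} \tag{66}$$

uniformly in $B \in \mathcal{F}^\lambda(c)$. □

Completion of proof of Theorem 2. We put together the three estimates (63),
(64) and (66). We obtain

$$\mathbb{E}_\mu\big[\|\widehat B_n - B\|^2_{L^2(\mathcal{D})}\big] \lesssim h_n^{2s} + \varpi_n^{-2}\,|\log h_n|\,(n h_n)^{-1} + \varpi_n^{-2}\,n^{-1}$$

uniformly in $B \in \mathcal{F}^\lambda(c) \cap \mathcal{H}^s(\mathcal{D}, M)$. The choice $h_n \sim n^{-1/(2s+1)}$ and the
fact that $\varpi_n^{-1}$ grows logarithmically in n yield the rate $n^{-s/(2s+1)}$ up to log
terms. The proof is complete. □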
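The bandwidth choice comes from balancing the bias and variance terms of the final bound. As a short worked check (ignoring the logarithmic factors carried by $\varpi_n^{-2}$ and $|\log h_n|$, which is an assumption of this sketch):

```latex
\begin{align*}
h_n^{2s} \asymp (n h_n)^{-1}
  &\iff h_n^{2s+1} \asymp n^{-1}
  \iff h_n \asymp n^{-1/(2s+1)}, \\
h_n^{2s} \asymp (n h_n)^{-1} &\asymp n^{-2s/(2s+1)}, \qquad
n^{-1} = o\big(n^{-2s/(2s+1)}\big).
\end{align*}
```

The squared error is therefore of order $n^{-2s/(2s+1)}$ up to logarithmic terms, and taking square roots gives the rate $n^{-s/(2s+1)}$.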
6 Appendix

6.1 Construction of the discrete model

Fix an initial condition $\mathbf{x} = (x,v) \in \mathcal{S}$. On a rich enough probability
space, we consider a Markov chain on the binary tree $(\tau_u, u \in \mathcal{U})$ with
transition $\rho(v,dv')$ and initial condition v. If $u = (u_1, \ldots, u_k) \in \mathcal{U}$, we write
$ui = (u_1, \ldots, u_k, i)$, i = 0, 1 for the two offspring of u; we set $\tau_\emptyset = v$ and

$$\tau_{u0} \sim \rho(\tau_u, dv') \quad \text{and} \quad \tau_{u1} \sim \rho(\tau_u, dv')$$

so that conditional on $\tau_u$, the two random variables $\tau_{u0}$ and $\tau_{u1}$ are
independent. We also pick a sequence of independent standard exponential
random variables $(e_u, u \in \mathcal{U})$, independent of $(\tau_u, u \in \mathcal{U})$. The model
$\big((\xi_u, \zeta_u, \tau_u), u \in \mathcal{U}\big)$ is then constructed recursively. We set

$$\xi_\emptyset = x, \quad b_\emptyset = 0, \quad \tau_\emptyset = v \quad \text{and} \quad \zeta_\emptyset = F^{-1}_{x,v}(e_\emptyset)$$

where $F_{x,v}(t) = \int_0^t B\big(x\exp(vs)\big)ds$. For $u \in \mathcal{U}$ and i = 0, 1, we put

$$\xi_{ui} = \frac{\xi_u}{2}\exp(\tau_u\zeta_u), \quad b_{ui} = b_u + \zeta_u, \quad \zeta_{ui} = F^{-1}_{\xi_{ui},\tau_{ui}}(e_{ui}).$$

To each node $u \in \mathcal{U}$, we then associate the mark $(\xi_u, b_u, \zeta_u, \tau_u)$ of the size,
date of birth, lifetime and growth rate respectively of the individual labeled
by u. One easily checks that Assumption 1 guarantees that the model is
well defined.
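The lifetime construction above is an inverse-transform sampling step: given a standard exponential variable e, the lifetime solves $F_{x,v}(\zeta) = e$. A minimal numerical sketch follows, assuming a midpoint quadrature for $F_{x,v}$ and a bisection search for its inverse; the function names, the search horizon `t_max` and the tolerances are illustrative choices, and the test rate $B(x) = x^2$ is used only because its inverse is known in closed form.

```python
import math

def cumulative_rate(B, x, v, t, steps=2000):
    """F_{x,v}(t) = int_0^t B(x exp(v s)) ds, approximated by the midpoint rule."""
    ds = t / steps
    return ds * sum(B(x * math.exp(v * (j + 0.5) * ds)) for j in range(steps))

def lifetime(B, x, v, e, t_max=20.0, tol=1e-8):
    """Solve F_{x,v}(zeta) = e by bisection on [0, t_max].

    Relies on F being nondecreasing in t and on the root lying below t_max.
    """
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_rate(B, x, v, mid) < e:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def daughter_size(x, v, zeta):
    """Size at birth of each daughter: half of the mother's size at division."""
    return 0.5 * x * math.exp(v * zeta)

# For B(x) = x**2, x = v = e = 1, the closed form is zeta = log(3) / 2.
zeta = lifetime(lambda s: s ** 2, 1.0, 1.0, 1.0)
```

With a closed-form $F^{-1}_{x,v}$ available (as for power-law rates), the bisection can of course be replaced by the explicit inverse.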

6.2 Proof of Lemma 1 

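Note first that

$$\{C_{t+h} - C_t \ge 1\} = \{t < b_{\vartheta_{C_t}} + \zeta_{\vartheta_{C_t}} \le t + h\}.$$

Since moreover $\xi_{\vartheta_{C_t}} = x\exp\big(\bar V(b_{\vartheta_{C_t}})\big)2^{-C_t}$, it follows by (7) that, writing $b = b_{\vartheta_{C_t}}$ for short,

$$\mathbb{P}\big(C_{t+h} - C_t \ge 1 \,\big|\, \mathcal{F}_t\big) = \int_{t-b}^{t+h-b}B\big(x e^{\bar V(b) + sV(t)}2^{-C_t}\big)\exp\Big(-\int_{t-b}^{s}B\big(x e^{\bar V(b) + s'V(t)}2^{-C_t}\big)ds'\Big)ds.$$

Introduce the quantity $B\big(x e^{\bar V(b) + V(t)(t-b)}2^{-C_t}\big)$ within the integral.
Noting that $\bar V(b_{\vartheta_{C_t}}) + V(t)(t - b_{\vartheta_{C_t}}) = \bar V(t)$ we obtain the first part of
the lemma thanks to the representation (27) and the uniform continuity of
B over compact sets. For the second part, introduce the $(\mathcal{F}_t)$-stopping time

$$\Upsilon_t = \inf\{s > t,\ C_s - C_t \ge 1\}$$

and note that $\{C_{t+h} - C_t \ge 1\} = \{\Upsilon_t \le t + h\} \in \mathcal{F}_{\Upsilon_t}$. Writing

$$\{C_{t+h} - C_t \ge 2\} = \{\Upsilon_t \le t + h,\ \Upsilon_{\Upsilon_t} \le t + h\}$$

and conditioning with respect to $\mathcal{F}_{\Upsilon_t}$, we first have

$$\mathbb{P}\big(C_{t+h} - C_t \ge 2\big) = \mathbb{E}\Big[\mathbb{P}\big(\Upsilon_{\Upsilon_t} \le t + h \,\big|\, \mathcal{F}_{\Upsilon_t}\big)\mathbf{1}_{\{\Upsilon_t \le t + h\}}\Big] \lesssim h\sup_{y \le x\exp(2e_{\max}t)}B(y)\ \mathbb{P}\big(\Upsilon_t \le t + h\big).$$

In the same way, $\mathbb{P}(\Upsilon_t \le t + h) \lesssim h$ and the conclusion follows.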
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